construction.

[0002] There are five classes of human antibodies. Each has the same basic structure (see Figure 1), or multiple
thereof, consisting of two identical polypeptides called heavy (H) chains (molecular weight approximately 50,000 d)
and two identical light (L) chains (molecular weight [...]). Each of the five antibody classes has a
similar set of light chains and a distinct set of heavy chains. A light chain is composed of one variable and one constant
domain, while a heavy chain is composed of one variable and three or more [...] domains. The variable
domains of a paired light and heavy chain are known as the Fv region, or simply [...] determines the specificity
of the immunoglobulin, the constant [...]

[0003] Amino acid sequence data indicate that each variable domain [...] conserved framework
[...] sometimes called complementarity [...] Department of Health and Human
[...] assumed to be responsible [...]

[...] appropriate mouse myeloma cell line.

[0005] The literature contains a host of references to the concept of targeting bioactive substances such as drugs,
toxins and enzymes to specific points in the body [...] or to induce a localized drug or
enzymatic effect. It has been proposed to achieve this effect by [...] substance to monoclonal
antibodies (see, e.g., Vogel, [...] Therapy of Cancer, 1987 [...]

[...] immunoglobulin sequences. Accordingly, hybrid antibody molecules have been proposed which [...]
from different mammalian sources. The [...]
mammalian source, and constant regions from human or another mammalian [...] (Morrison et al. (1984) Proc. Natl.
[...] P.C.T. application no. [...]
[0007] It has been reported that [...]
at the amino terminal end of both the heavy and [...] the native antibody molecule, and retain
[...] of the antigen recognition and [...] (as V H V L dimers, termed Fv regions) even after [...]
(Inbar et al., Proc. Natl. Acad. Sci. U.S.A. (1972) 69:2659-2662; Hochman et al. (1973) Biochem. [...];
[...] (1976) Biochem. 15:2706-2710; Sharon and Givol (1976) Biochem. 15:1591-1594; Rosen[...];
Ehrlich et al. (1980) [...]

Summary of the Claimed Invention

[0008] [...] polypeptide chain [...] polypeptide domain to form a single
[...] a first and a second [...] domains connected by the [...]






EP 0 623 679 B1 

reactive side chains but no cysteine. The hydrophilic amino acids constitute a hydrophilic sequence which has a flexible 
unstructured configuration essentially free of secondary structure in aqueous solution. The linking sequence contains
a plurality of glycine or serine residues and spans the distance between the C-terminal end of the first domain and the 
N-terminal end of the second domain. 

[0009] The linking sequence of the polypeptide chain described in the previous paragraph may contain threonine.
The first and second non-naturally peptide-bonded, biologically active domains may be connected to the linking
sequence such that the linking sequence is peptide-bonded at its N-terminus to the first biologically active domain and
at its C-terminus to the second biologically active domain. The linking sequence may further comprise plural consecutive
copies of an amino acid sequence, for example the amino acid sequence (GlyGlyGlyGlySer)3, and may further comprise
one or a pair of amino acid sequences recognizable by a site-specific cleavage agent.

[0010] According to another aspect of the invention there is provided a polypeptide linker. The polypeptide linker has
a length of at least 10 amino acid residues and links two non-naturally linked polypeptide domains to form a
multifunctional protein. The linker exhibits amino acids with small and unreactive side chains and comprises plural
hydrophilic peptide-bonded amino acids constituting a hydrophilic sequence. The linker spans the distance between the
C-terminal end of a first domain and the N-terminal end of a second domain. Each domain comprises a biologically active
polypeptide having a conformation suitable for biological activity independent of the biological activity of the other domain.
[0011] The polypeptide linker described in the above paragraph may, independently, comprise threonine, be
cysteine-free, comprise a plurality of glycine or serine residues, comprise plural consecutive copies of an amino acid sequence,
span a distance of at least 4 nm (40 Angstroms), comprise the amino acid sequence (GlyGlyGlyGlySer)3, or comprise
one amino acid sequence or a pair of amino acid sequences recognizable by a site-specific cleavage agent. At least
one of the polypeptide domains linked by the polypeptide linker described in the above paragraph may comprise an
enzyme, a toxin, a receptor, a binding site, a biosynthetic antibody binding site, a growth factor, a cell-differentiation
factor, a lymphokine, a cytokine, a hormone, a remotely detectable moiety or an anti-metabolite. The first domain linked
to the polypeptide linker described in the above paragraph may comprise a single chain binding site, and the second
domain linked to the polypeptide linker described in the above paragraph may comprise an enzyme, a toxin, a receptor,
a binding site, a biosynthetic antibody binding site, a growth factor, a cell-differentiation factor, a lymphokine, a cytokine,
a hormone or an anti-metabolite. At least one of the domains linked to the linker described in the previous paragraph
may comprise a polypeptide capable of sequestering an ion, such as preferably calmodulin, metallothionein, a
fragment thereof, or an amino acid sequence rich in at least one of glutamic acid, aspartic acid, lysine and arginine. The
amino acids of the linker described in the previous paragraph may assume an unstructured polypeptide configuration
in aqueous solution.
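
The sequence-level properties attributed to the linker in the preceding paragraphs (a length of at least 10 residues, no cysteine, a plurality of glycine or serine residues, plural consecutive copies of a unit such as (GlyGlyGlyGlySer)3, and a span of at least 4 nm) can be illustrated with a minimal sketch. The one-letter notation, the function names, and in particular the figure of roughly 0.35 nm per residue for a fully extended chain are assumptions made for this illustration and are not taken from the specification.

```python
# Illustrative sketch only. Function names and the per-residue rise are
# assumptions for this example, not values stated in the specification.

ASSUMED_RISE_NM = 0.35  # assumed length contributed per residue when extended


def build_linker(unit: str = "GGGGS", copies: int = 3) -> str:
    """Return plural consecutive copies of a unit, e.g. (GlyGlyGlyGlySer)3."""
    if copies < 2:
        raise ValueError("'plural consecutive copies' needs at least two")
    return unit * copies


def linker_report(seq: str) -> dict:
    """Check the sequence-level properties the text attributes to the linker."""
    gly_ser = seq.count("G") + seq.count("S")
    max_span_nm = len(seq) * ASSUMED_RISE_NM  # upper bound, fully extended
    return {
        "length_at_least_10": len(seq) >= 10,
        "cysteine_free": "C" not in seq,
        "plural_gly_or_ser": gly_ser >= 2,
        "can_span_4_nm": max_span_nm >= 4.0,
    }


linker = build_linker()
print(linker)
for name, ok in linker_report(linker).items():
    print(f"{name}: {ok}")
```

The span check is only an upper bound under the stated assumption; an unstructured linker in solution would not normally be fully extended.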

[0012] According to another aspect of the invention there is provided a polypeptide linker. The polypeptide linker has
a length of at least 10 amino acid residues and links two non-naturally linked polypeptide domains so as to form a
functional protein. The linker exhibits amino acids with small and unreactive side chains and comprises plural
hydrophilic peptide-bonded amino acids constituting a hydrophilic sequence. The linker spans the distance between the
C-terminal end of a first domain and the N-terminal end of a second domain. Together, the domains comprise an
immunologically reactive binding site for a preselected antigen.

[0013] The two polypeptide domains linked by the linker described in the previous paragraph may be of such a nature
as to mimic a V H and V L chain from a natural immunoglobulin. The polypeptide linker described in the above paragraph
may, independently, comprise threonine, be cysteine-free, comprise a plurality of glycine or serine residues, comprise
plural consecutive copies of an amino acid sequence, span a distance of at least 4 nm (40 Angstroms), comprise the
amino acid sequence (GlyGlyGlyGlySer)3, or comprise one amino acid sequence or a pair of amino acid sequences
recognizable by a site-specific cleavage agent. The amino acids of the linker described in the previous paragraph may
assume an unstructured polypeptide configuration in aqueous solution.

[0014] Further aspects of this invention provide DNA encoding the polypeptide chain described above, DNA encoding
the respective polypeptide linkers described above and a host cell transformed with and capable of expressing the 
DNA encoding the respective polypeptide linkers described above. 

[0015] Note: The instant application is a divisional application of EP 88905298.1, now granted as EP 0 318 554. As
much of the description of the parent application is necessary to understand the subject matter claimed in the instant
application in its proper context, large portions of the parent description were left intact in adapting the instant description
to the allowed claims. It should not, however, be inferred that subject matter contained in the instant description and
covered by claims of EP 0 318 554 is included in the subject matter instantly claimed.

[0016] As used herein, the phrase biosynthetic antibody binding site or BABS means synthetic proteins expressed
from DNA derived by recombinant techniques. BABS comprise biosynthetically produced sequences of amino acids
defining polypeptides designed to bind with a preselected antigenic material. The definition of BABS according to the
instant divisional application does not include the polypeptide constructs as claimed herein. The structure of these
synthetic polypeptides is unlike that of naturally occurring antibodies, fragments thereof, e.g., Fv, or known synthetic
polypeptides or "chimeric antibodies" in that the regions of the BABS responsible for specificity and affinity of binding
(analogous to native antibody variable regions) are linked by peptide bonds, expressed from a single DNA, and may
themselves be chimeric, e.g., may comprise amino acid sequences homologous to portions of at least two different
antibody molecules. BABS are biosynthetic in the sense that they are synthesized in a cellular host made to
express a synthetic DNA, that is, a recombinant DNA made by ligation of plural, chemically synthesized
oligonucleotides, or by ligation of fragments of DNA derived from the genome of a hybridoma, mature B cell clone, or a cDNA
library derived from such natural sources. The single polypeptide chain and linker-containing multifunctional proteins
of the invention are properly characterized as "binding sites" in that these synthetic molecules are designed to have
specific affinity for a preselected antigenic determinant. The polypeptides of the invention comprise structures which
can be patterned after regions of native antibodies known to be responsible for antigen recognition.

Brief Description of the Drawing 

[0017] The foregoing and other objects of this invention, the various features thereof, as well as the invention itself,
may be more fully understood from the following description, when read together with the accompanying drawings.
[0018] Figure 1A is a schematic representation of an intact IgG antibody molecule containing two light chains, each
consisting of one variable and one constant domain, and two heavy chains, each consisting of one variable and three
constant domains. Figure 1B is a schematic drawing of the structure of Fv proteins (and DNA encoding them) illustrating
V H and V L domains, each of which comprises four framework (FR) regions and three complementarity determining
(CDR) regions. Boundaries of CDRs are indicated, by way of example, for monoclonal 26-10, a well known and characterized [...]

[...] understanding. Sequence 1 is the known native sequence of V H from murine monoclonal glp-4 (anti-lysozyme).
Sequence 2 is the known native sequence of V H from murine monoclonal 26-10 (anti-digoxin). Sequence 3 is a BABS
comprising [...] CDRs from glp-4 V H. The CDRs are identified in lowercase letters; restriction [...]
from human myeloma antibody NEWM. Sequence 5 is a BABS comprising the FRs from NEWM V H and the [...]

[0021] Figures 4A-4F are the synthetic nucleic acid sequences and encoded amino acid sequences of (4A) the heavy
chain variable domain of murine anti-digoxin monoclonal 26-10; (4B) the light chain variable domain of murine
anti-digoxin monoclonal 26-10; (4C) a heavy chain variable domain of a BABS comprising CDRs of glp-4 and FRs of 26-10;
(4D) a light chain variable region of the same BABS; (4E) a heavy chain variable region of a BABS comprising CDRs
of glp-4 and FRs of NEWM; and (4F) a light chain variable region comprising CDRs of glp-4 and FRs of NEWM.
[...] CDRs and restriction sites for endonuclease digestion, most of which were introduced during [...]

[0022] Figure 5 is the nucleic acid and encoded amino acid sequence of a host DNA [...]
insertion of CDRs of choice. The DNA was designed to have unique 6-base sites directly flanking the CDRs so that
[...] portions of CDRs can be readily inserted, and to have other [...]
DNA to optimize binding properties in a given construct. The framework regions of the molecule [...]

[0023] Figures 6A and 6B are [...] (and DNA encoding them) comprising a [...]
the specificity of murine monoclonal 26-10, linked through a spacer to the FB fragment of protein A, here fused
as a leader and constituting a binding site for Fc. The spacer comprises the 11 C-terminal amino acids of the FB
followed by Asp-Pro (a dilute acid cleavage site). The single chain BABS comprises sequences mimicking the V H and
V L (6A) and the V L and V H (6B) of murine monoclonal 26-10. The V L in construct 6A is altered at residue 4 where valine
[...] in the parent 26-10 sequence. These constructs contain binding sites for both Fc and
digoxin. Their structure may be summarized as:
(6A) FB-Asp-Pro-V H -(Gly4-Ser)3-V L ,

and

(6B) FB-Asp-Pro-V L -(Gly4-Ser)3-V H ,

where (Gly4-Ser)3 is a polypeptide linker.
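
The two summaries above differ only in the order of the variable-like domains around the linker. A minimal sketch of that layout follows; the domain names are used purely as labels (no actual FB, V H or V L sequences are represented), and the function names are hypothetical.

```python
# Illustrative sketch: labels only, no real domain sequences are used here.

LINKER_LABEL = "(Gly4-Ser)3"
ACID_CLEAVAGE_SITE = "Asp-Pro"  # dilute acid cleavage site named in the text


def summarize(first_domain: str, second_domain: str) -> str:
    """Return the N-to-C terminal layout of an FB-led single chain construct."""
    if {first_domain, second_domain} != {"VH", "VL"}:
        raise ValueError("expected one VH-like and one VL-like domain")
    parts = ["FB", ACID_CLEAVAGE_SITE, first_domain, LINKER_LABEL, second_domain]
    return "-".join(parts)


def expand_linker(unit: str = "GGGGS", copies: int = 3) -> str:
    """Spell out the linker in one-letter code: three copies of Gly4-Ser."""
    return unit * copies


construct_6a = summarize("VH", "VL")
construct_6b = summarize("VL", "VH")
print(construct_6a)
print(construct_6b)
print(expand_linker())
```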

[0024] In Figures 4A-4E and 6A and 6B, the amino acid sequences of the expression products start after the GAATTC
sequence, which encodes an EcoRI splice site and is translated as Glu-Phe in the drawings.
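
The reading of GAATTC as Glu-Phe follows directly from the standard genetic code (GAA encodes glutamic acid, TTC encodes phenylalanine). The short sketch below shows the in-frame translation; the partial codon table and the function name are choices made for this illustration only.

```python
# Illustrative sketch: a deliberately partial table of standard genetic code
# entries, just enough to translate the EcoRI site and a Gly/Ser linker.
CODON_TABLE = {
    "GAA": "Glu", "GAG": "Glu",
    "TTC": "Phe", "TTT": "Phe",
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "TCT": "Ser", "TCC": "Ser", "TCA": "Ser", "TCG": "Ser",
    "AGT": "Ser", "AGC": "Ser",
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into hyphen-joined three-letter codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    unknown = [c for c in codons if c not in CODON_TABLE]
    if unknown:
        raise ValueError(f"codons outside this partial table: {unknown}")
    return "-".join(CODON_TABLE[c] for c in codons)


ECORI_SITE = "GAATTC"
print(translate(ECORI_SITE))  # Glu-Phe
```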

[0025] Figure 7A is a graph of percent of maximum counts bound of radioiodinated digoxin versus concentration of
binding protein adsorbed to the plate comparing the binding of native 26-10 (curve 1) and the construct of Figure 6A
and Figure 2B renatured using two different procedures (curves 2 and 3). Figure 7B is a graph demonstrating the
bifunctionality of the FB-(26-10) BABS adhered to microtiter plates through the specific binding of the binding site to
the digoxin-BSA coat on the plate. Figure 7B shows the percent inhibition of 125I-rabbit-IgG binding to the FB domain
of the FB BABS by the addition of IgG, protein A, FB, murine IgG2a, and murine IgG1.
[0026] Figure 8 is a schematic representation of a model assembled DNA sequence encoding a multifunctional
biosynthetic protein comprising a leader peptide (used to aid expression and thereafter cleaved), a binding site, a
spacer, and an effector molecule attached as a trailer sequence.

[0027] Figures 9A-9E are exemplary synthetic nucleic acid sequences and corresponding encoded amino acid
sequences of binding sites of different specificities: (A) FRs from NEWM and CDRs from 26-10 having the digoxin
specificity of murine monoclonal 26-10; (B) FRs from 26-10, and CDRs from G-loop-4 (glp-4) having lysozyme specificity;
(C) FRs and CDRs from MOPC-315 having dinitrophenol (DNP) specificity; (D) FRs and CDRs from an anti-CEA
monoclonal antibody; (E) FRs in both V H and V L, and CDR1 and CDR3 in V H, and CDR1, CDR2, and CDR3 in V L from
an anti-CEA monoclonal antibody; CDR2 in V H is a CDR2 consensus sequence found in most immunoglobulin V H
regions.

[0028] Figure 10A is a schematic representation of the DNA and amino acid sequence of a leader peptide (MLE)
protein with corresponding DNA sequence and some major restriction sites. Figure 10B shows the design of an
expression plasmid used to express MLE-BABS (26-10). During construction of the gene, fusion partners were joined at
the EcoRI site that is shown as part of the leader sequence. The pBR322 plasmid, opened at the unique SspI and PstI
sites, was combined in a 3-part ligation with an SspI to EcoRI fragment bearing the trp promoter and MLE leader and
with an EcoRI to PstI fragment carrying the BABS gene. The resulting expression vector confers tetracycline resistance
on positive transformants.

[0029] Figure 11 is an SDS-polyacrylamide gel (15%) of the (26-10) BABS at progressive stages of purification. Lane
0 shows low molecular weight standards; lane 1 is the MLE-BABS fusion protein; lane 2 is an acid digest of this material;
lane 3 is the pooled DE-52 chromatographed protein; lanes 4 and 5 are the same ouabain-Sepharose pool of single
chain BABS except that lane 4 protein is reduced and lane 5 protein is unreduced.

[0030] Figure 12 shows inhibition curves for 26-10 BABS and 26-10 Fab species, and indicates the relative affinities
of the antibody fragment for the indicated cardiac glycosides.

[0031] Figures 13A and 13B are plots of digoxin binding curves. (A) shows 26-10 BABS binding isotherm and Sips 
plot (inset), and (B) shows 26-10 Fab binding isotherm and Sips plot (inset). 
[0032] Figure 14 is a nucleic acid sequence and corresponding amino acid sequence of a modified FB dimer leader
sequence and various restriction sites. 

[0033] Figures 15A-15H are nucleic acid sequences and corresponding amino acid sequences of biosynthetic
multifunctional proteins including a single chain BABS and various biologically active protein trailers linked via a spacer
sequence. Also indicated are various endonuclease digestion sites. The trailing sequences are (A) epidermal growth
factor (EGF); (B) streptavidin; (C) tumor necrosis factor (TNF); (D) calmodulin; (E) platelet derived growth factor-beta
(PDGF-beta); (F) ricin; (G) interleukin-2; and (H) an FB-FB dimer.

Description 

[0034] The invention will first be described in its broadest overall aspects with a more detailed description following.
[0035] A class of novel biosynthetic, bi- or multifunctional proteins has now been designed and engineered which
comprise biosynthetic antibody binding sites, that is, "BABS" or biosynthetic polypeptides defining structure capable
of selective antigen recognition and preferential antigen binding, and one or more peptide-bonded additional protein
or polypeptide regions designed to have a preselected property. Examples of the second region include amino acid
sequences designed to sequester ions, which makes the protein suitable for use as an imaging agent, and sequences
designed to facilitate immobilization of the protein for use in affinity chromatography and solid phase immunoassay.
Another example of the second region is a bioactive effector molecule, that is, a protein having a conformation suitable
for biological activity, such as an enzyme, toxin, receptor, binding site, growth factor, cell differentiation factor,
lymphokine, cytokine, hormone, or anti-metabolite. This invention features synthetic, multifunctional proteins comprising
these regions peptide bonded to one or more biosynthetic antibody binding sites, synthetic, single chain proteins
designed to bind preselected antigenic determinants with high affinity and specificity, constructs containing multiple
binding sites linked together to provide multipoint antigen binding and high net affinity and specificity, DNA encoding
these proteins prepared by recombinant techniques, host cells harboring these DNAs, and methods for the production [...]
[0036] It has been discovered that biosynthetic domains mimicking the structure of the two chains of an
immunoglobulin binding site may be connected by a polypeptide linker while closely approaching, retaining, and often
improving their collective binding properties.
[0037] The binding site region of the multifunctional proteins comprises at least one, and preferably two domains,
each of which has an amino acid sequence homologous to portions of the CDRs of the variable domain of an
immunoglobulin light or heavy chain, and other sequence homologous to the FRs of the variable domain of the same, or a
[...] linking the domains. Polypeptides so constructed bind a specific preselected antigen determined by the CDRs held in
[...] by the FRs and the linker. Preferred structures have human FRs, i.e., [...]
sequence of at least a portion of the framework regions of a human immunoglobulin, and have linked domains which
[...] structure mimicking a V H -V L or V L -V H immunoglobulin two-chain [...]
mammalian immunoglobulin, such as those of mouse, rat, or human origin are preferred. The biosynthetic antibody
binding site [...] FRs homologous with a portion of the FRs of a human immunoglobulin and CDRs homologous
[...] mouse or rat immunoglobulin. This type of chimeric polypeptide displays the antigen binding
[...] of the mouse or rat immunoglobulin, while its human framework minimizes human immune [...]
the chimeric polypeptide may comprise other amino acid sequences. It may comprise, for example, a sequence
homologous to a portion of the constant domain of an immunoglobulin, but preferably is free of constant regions (other [...]
[0038] The binding site region(s) of the chimeric proteins are thus single chain composite polypeptides comprising
[...]
has a structure patterned after tandem V H and V L domains, but with the carboxyl terminal of one attached through a
[...] to the amino terminal of the other. The linking amino acid sequence may or may not itself
be antigenic or biologically active. It preferably spans a distance of at least about 4 nm (40 Å), i.e., comprises at least
[...] amino acids, and comprises residues which together present a hydrophilic, relatively unstructured region.
Linking amino acid sequences having little or no secondary structure work well. Optionally, one or a pair of unique
amino acids or amino acid sequences recognizable by a site specific cleavage agent may be included in the linker.
This permits the V H and V L -like domains to be separated after expression, or the linker to be excised after refolding of [...]

[0039] Either the amino or carboxyl terminal ends (or both ends) of these chimeric, single chain binding sites are
attached to an amino acid sequence which itself is bioactive or has some other function to produce a bifunctional or
multifunctional protein. For example, the synthetic binding site may include a leader and/or trailer sequence defining
a polypeptide having enzymatic activity, independent affinity for an antigen different from the antigen to which the
binding site is directed, or having other functions such as to provide a convenient site of attachment for a radioactive
[...] residue designed to link chemically to a solid support. This fused, independently functional section
of protein should be distinguished from fused leaders used simply to enhance expression in prokaryotic [...]
yeasts. The multifunctional proteins also should be distinguished from the "conjugates" disclosed in the prior art
comprising antibodies which, after expression, are linked chemically to a second moiety.

[0040] [...] of amino acids designed as a "spacer" is interposed between the active regions of the
multifunctional protein. Use of such a spacer can promote independent refolding of the regions of the protein. The spacer
also may include a specific sequence of amino acids recognized by an endopeptidase, for example, [...]
cell (e.g., one having a surface protein recognized by the binding site) so that the bioactive effector protein is
[...]
exhibit less of a tendency to interfere with the binding behavior of the BABS.

[0041] The therapeutic use of such "self-targeted" bioactive proteins offers a number of advantages over [...]
of immunoglobulin fragments or complete antibody molecules: they are stable, less immunogenic and have a lower
molecular weight; they can penetrate body tissues more rapidly for purposes of imaging or drug delivery because of
their smaller size; and they can facilitate accelerated clearance of targeted isotopes or drugs. Furthermore, because
[...]
an essentially limitless combination of binding sites and bioactive proteins is possible, each of which can be refined
as disclosed herein to optimize independent activity at each region of the synthetic protein. The synthetic proteins can
[...] and thus are less costly to produce than immunoglobulins or fragments
thereof which require expression in cultured animal cell lines.

[0042] The invention thus provides a family of recombinant proteins expressed from a single piece of DNA, all of
which have the capacity to bind specifically with a predetermined antigenic determinant. The preferred species of the
proteins comprise a second domain which functions independently of the binding region. In this aspect the invention
provides a [...] of "self-targeted" proteins which have a bioactive function and which deliver that function to a locus
determined by the binding site's specificity. It also provides biosynthetic binding proteins having attached polypeptides
suitable for attachment to immobilization matrices which may be used in affinity chromatography and solid phase
immunoassay applications, or suitable for attachment to ions, e.g., radioactive ions, which may be used for in vivo
imaging.

[0043] The successful design and manufacture of the proteins of the invention depends on the ability to produce
biosynthetic binding sites, and most preferably, sites comprising two domains mimicking the variable domains of
immunoglobulin connected by a linker.

[0044] As is now well known, Fv, the minimum antibody fragment which contains a complete antigen recognition and
binding site, consists of a dimer of one heavy and one light chain variable domain in noncovalent association (Figure
1A). It is in this configuration that the three complementarity determining regions of each variable domain interact to
define an antigen binding site on the surface of the V H -V L dimer. Collectively, the six complementarity determining
regions (see Figure 1B) confer antigen binding specificity to the antibody. FRs flanking the CDRs have a tertiary
structure which is essentially conserved in native immunoglobulins of species as diverse as human and mouse. These FRs
serve to hold the CDRs in their appropriate orientation. The constant domains are not required for binding function,
but may aid in stabilizing V H -V L interaction. Even a single variable domain (or half of an Fv comprising only three CDRs
specific for an antigen) has the ability to recognize and bind antigen, although at a lower affinity than an entire binding
site (Painter et al. (1972) Biochem. 11:1327-1337).

[0045] This knowledge of the structure of immunoglobulin proteins has now been exploited to develop multifunctional
fusion proteins comprising biosynthetic antibody binding sites and one or more other domains.

[0046] The structure of these biosynthetic proteins in the region which imparts the binding properties to the protein is
analogous to the Fv region of a natural antibody. It comprises at least one, and preferably two domains consisting of
amino acids defining V H and V L -like polypeptide segments connected by a linker which together form the tertiary
molecular structure responsible for affinity and specificity. Each domain comprises a set of amino acid sequences
analogous to immunoglobulin CDRs held in appropriate conformation by a set of sequences analogous to the framework
regions (FRs) of an Fv fragment of a natural antibody.

[0047] The term CDR, as used herein, refers to amino acid sequences which together define the binding affinity and
specificity of the natural Fv region of a native immunoglobulin binding site, or a synthetic polypeptide which mimics
this function. CDRs typically are not wholly homologous to hypervariable regions of natural Fvs, but rather also may
include specific amino acids or amino acid sequences which flank the hypervariable region and have heretofore been
considered framework not directly determinative of complementarity. The term FR, as used herein, refers to amino acid
sequences flanking or interposed between CDRs.

[0048] The CDR and FR polypeptide segments are designed based on sequence analysis of the Fv region of
preexisting antibodies or of the DNA encoding them. In one embodiment, the amino acid sequences constituting the FR
regions of the BABS are analogous to the FR sequences of a first preexisting antibody, for example, a human IgG.
The amino acid sequences constituting the CDR regions are analogous to the sequences from a second, different
preexisting antibody, for example, the CDRs of a murine IgG. Alternatively, the CDRs and FRs from a single preexisting
antibody from, e.g., an unstable or hard to culture hybridoma, may be copied in their entirety.
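
The embodiment in which FRs are taken from one antibody and CDRs from another can be pictured as interleaving four framework segments with three CDR segments. The sketch below is a toy illustration under that reading: the segment strings are made-up placeholders (not real antibody sequences), the function name is hypothetical, and lowercase is used for CDRs only to make them visible in the output.

```python
# Illustrative sketch: placeholder strings, not real antibody sequences.

def graft(frs, cdrs):
    """Interleave FRs and CDRs as FR1-CDR1-FR2-CDR2-FR3-CDR3-FR4."""
    if len(frs) != 4 or len(cdrs) != 3:
        raise ValueError("expected four FR segments and three CDR segments")
    fr1, fr2, fr3, fr4 = (fr.upper() for fr in frs)
    cdr1, cdr2, cdr3 = (cdr.lower() for cdr in cdrs)  # lowercase marks CDRs
    return fr1 + cdr1 + fr2 + cdr2 + fr3 + cdr3 + fr4


# Hypothetical framework segments standing in for a first (e.g. human) antibody
first_antibody_frs = ("EVQL", "WVRQ", "RFTI", "WGQG")
# Hypothetical CDR segments standing in for a second (e.g. murine) antibody
second_antibody_cdrs = ("SYAM", "AISG", "DYFD")

variable_domain = graft(first_antibody_frs, second_antibody_cdrs)
print(variable_domain)
```

In practice the text contemplates that CDRs may also carry certain flanking residues, which a real design step would need to take into account; this sketch ignores that refinement.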
[0049] The design and biosynthesis of various reagents is enabled, all of which are characterized by a region having
affinity for a preselected antigenic determinant. The binding site and other regions of the biosynthetic protein are
designed with the particular planned utility of the protein in mind. Thus, if the reagent is designed for intravascular use
in mammals, the FR regions may comprise amino acids similar or identical to at least a portion of the framework region
amino acids of antibodies native to that mammalian species. On the other hand, the amino acids comprising the CDRs
may be analogous to a portion of the amino acids from the hypervariable region (and certain flanking amino acids) of
an antibody having a known affinity and specificity, e.g., a murine or rat monoclonal antibody.
[0050] Other sections of native immunoglobulin protein structure, e.g., C H and C L , need not be present and normally
are intentionally omitted from the biosynthetic proteins. However, the proteins of the invention normally comprise
additional polypeptide or protein regions defining a bioactive region, e.g., a toxin or enzyme, or a site onto which a toxin
or a remotely detectable substance can be attached.

[0051] The invention thus can provide intact biosynthetic antibody binding sites analogous to V H -V L dimers, either
non-covalently associated, disulfide bonded, or preferably linked by a polypeptide sequence to form a composite V H -V L
or V L -V H polypeptide which may be essentially free of antibody constant region. The invention also provides proteins
analogous to an independent V H or V L domain, or dimers thereof. Any of these proteins may be provided in a form
linked to, for example, amino acids analogous or homologous to a bioactive molecule such as a hormone or toxin.
[0052] Connecting the independently functional regions of the protein is a spacer comprising a short amino acid
sequence whose function is to separate the functional regions so that they can independently assume their active
tertiary conformation. The spacer can consist of an amino acid sequence present on the end of a functional protein
which sequence is not itself required for its function, and/or specific sequences engineered into the protein at the DNA
level.

[0053] The spacer generally may comprise between 5 and 25 residues. Its optimal length may be determined using 
[...] least a portion of which defines a binding site patterned after the variable region of an immunoglobulin. It will be apparent
that the nature of any protein fragments linked to the BABS, and used for reagents embodying the invention, is
essentially unlimited, the essence of the invention being the provision, either alone or linked to other proteins, of binding
sites having specificities to any antigen desired.

[0061] The clinical administration of multifunctional proteins comprising a BABS, or a BABS alone, affords a number
of advantages over the use of intact natural or chimeric antibody molecules, fragments thereof, and conjugates
comprising such antibodies linked chemically to a second bioactive moiety. The multifunctional proteins described herein
offer fewer cleavage sites to circulating proteolytic enzymes, their functional domains are connected by peptide bonds
to polypeptide linker or spacer sequences, and thus the proteins have improved stability. Because of their smaller size
and efficient design, the multifunctional proteins described herein reach their target tissue more rapidly, and are cleared
more quickly from the body. They also have reduced immunogenicity. In addition, their design facilitates coupling to
other moieties in drug targeting and imaging applications. Such coupling may be conducted chemically after expression
of the BABS to a site of attachment for the coupling product engineered into the protein at the DNA level. Active effector
proteins having toxic, enzymatic, binding, modulating, cell differentiating, hormonal, or other bioactivity are expressed
from a single DNA as a leader and/or trailer sequence, peptide bonded to the BABS.

Design and Manufacture 

[0062] The single polypeptide chain and linkers of the invention are designed at the DNA level. The chimeric or
synthetic DNAs are then expressed in a suitable host system, and the expressed proteins are collected and renatured
if necessary. A preferred general structure of the DNA encoding the proteins is set forth in Figure 8. As illustrated, it
encodes an optimal leader sequence used to promote expression in procaryotes having a built-in cleavage site
recognizable by a site specific cleavage agent, for example, an endopeptidase, used to remove the leader after expression.
This is followed by DNA encoding a V H -like domain, comprising CDRs and FRs, a linker, a V L -like domain, again
comprising CDRs and FRs, a spacer, and an effector protein. After expression, folding, and cleavage of the leader, a
bifunctional protein is produced having a binding region whose specificity is determined by the CDRs, and a
peptide-linked independently functional effector region.

[0063] The ability to design the BABS of the invention depends on the ability to determine the sequence of the amino
acids in the variable region of monoclonal antibodies of interest, or the DNA encoding them. Hybridoma technology
enables production of cell lines secreting antibody to essentially any desired substance that produces an immune
response. RNA encoding the light and heavy chains of the immunoglobulin can then be obtained from the cytoplasm
of the hybridoma. The 5' end portion of the mRNA can be used to prepare cDNA for subsequent sequencing, or the
amino acid sequence of the hypervariable and flanking framework regions can be determined by amino acid sequencing
of the V region fragments of the H and L chains. Such sequence analysis is now conducted routinely. This knowledge,
coupled with observations and deductions of the generalized structure of immunoglobulin Fvs, permits one to design
synthetic genes encoding FR and CDR sequences which likely will bind the antigen. These synthetic genes are then
prepared using known techniques, or using the technique disclosed below, inserted into a suitable host, and expressed,
and the expressed protein is purified. Depending on the host cell, renaturation techniques may be required to attain
proper conformation. The various proteins are then tested for binding ability, and one having appropriate affinity is
selected for incorporation into a reagent of the type described above. If necessary, point substitutions seeking to
optimize binding may be made in the DNA using conventional cassette mutagenesis or other protein engineering
methodology such as is disclosed below.

[0064] Preparation of the proteins of the invention also is dependent on knowledge of the amino acid sequence (or
corresponding DNA or RNA sequence) of bioactive proteins such as enzymes, toxins, growth factors, cell differentiation
factors, receptors, anti-metabolites, hormones or various cytokines or lymphokines. Such sequences are reported in
the literature and available through computerized data banks.

[0065] The DNA sequences of the binding site and the second protein domain are fused using conventional
techniques, or assembled from synthesized oligonucleotides, and then expressed using equally conventional techniques.
[0066] The processes for manipulating, amplifying, and recombining DNA which encode amino acid sequences of
interest are generally well known in the art, and therefore, not described in detail herein. Methods of identifying and
isolating genes encoding antibodies of interest are well understood, and described in the patent and other literature.
In general, the methods involve selecting genetic material coding for amino acids which define the proteins of interest,
including the CDRs and FRs of interest, according to the genetic code.

[0067] Accordingly, the construction of DNAs encoding proteins as disclosed herein can be done using known
techniques involving the use of various restriction enzymes which make sequence specific cuts in DNA to produce blunt
ends or cohesive ends, DNA ligases, techniques enabling enzymatic addition of sticky ends to blunt-ended DNA,
construction of synthetic DNAs by assembly of short or medium length oligonucleotides, cDNA synthesis techniques,
and synthetic probes for isolating immunoglobulin or other bioactive protein genes. Various promoter sequences and






EP 0 623 679 B1 






*. regraatory DNA •« ^-^1=5153 ^SS«™a"^S^ 

cleotides produced in a conventional, automated po synthesized semi manualiy 

es.Forexample.overlapping.complementa^ 

using phosphoramidite chemistry, with end segments left ""f^JJ^J S action of a particular restriction 
One end of the synthetic DNA is left with a "sticky end '^^^ endonu- 
endonuclease, and the other end is left wit ^^^^^^^^^ be created by synthe- 

l FR S , and intervening "dummy" CDRs or ^^J^^^,^ action is employed 

approach facilitates empirical ^^^^to.^^ ' 
[0070] This technique is dependent upon the ability to cleave a

gene at specific sites flanking nucleotide sequences encod '^.^J^JZSd , «* the nucleotide sequence 
found in the native gene. Alternatively, non-nat,ve encoding the same 

resulting in a synthetic gene with a different sequence < of nucleo "^^ w ^endonuclease 
variable region amino acids because of the degenera^^^ 

digestion, and comprising FR-encod.ng sequences are then I ga tad t non natrv o 9 ^ ^ 

transfection with an appropriate vector. In E c* and ^r^^^^^^ihed by the transfection of 
fusion protein which is subsequently cleaved. Ex P ress ' on " acids deling a second function into a 

DNA sequences encoding CDR and I R St^SST JLfl hybrid Fv regions and 
myeloma or other type of cell line. By this stra tegy «w JW.a » f . tein expre ssed in bac- 

SSL « me Ik. ano .r,bee,o.n»y ^.^Z^SS^<^ » • «™— 

5 m » «e.,ed * the ^^SS'tTZ^L- «* reeior»e (a- » 

[0074) The hinge ,egta, een take m.ny -.reel forma «. *W "J^tajj^ 2 >ag , eite ee appropriate 
DNA tragment eneodieg there) «!** intpert to the regron of lire M prWjn eboot me e » ^ 
polarity, charge distribution, and stereochemistry «hr* dr ^^^^S^Zmi- eitie thai mey be 
eteentiy exposes the el.eyeg. eite to the ^ '^SS^ C aia the amino abide 






glutamic acid, arginine, lysine, serine, and threonine residues maximize ionic interactions and may be present in
amounts and/or in a sequence which renders the moiety comprising the hinge water soluble.

[0076] The cleavage site preferably is immediately adjacent the Fv polypeptide chains and comprises one amino
acid or a sequence of amino acids exclusive of any sequence found in the amino acid structure of the chains in the
Fv. The cleavage site preferably is designed for unique or preferential cleavage by a specific selected agent.
Endopeptidases are preferred, although non-enzymatic (chemical) cleavage agents may be used. Many useful cleavage agents,
for instance, cyanogen bromide, dilute acid, trypsin, Staphylococcus aureus V-8 protease, post proline cleaving
enzyme, blood coagulation Factor Xa, enterokinase, and renin, recognize and preferentially or exclusively cleave
particular cleavage sites. One currently preferred cleavage agent is V-8 protease. The currently preferred cleavage site is
a Glu residue. Other useful enzymes recognize multiple residues as a cleavage site, e.g., factor Xa (Ile-Glu-Gly-Arg)
or enterokinase (Asp-Asp-Asp-Asp-Lys). The principles of this selective cleavage approach may also be used in the
design of the linker and spacer sequences of the multifunctional constructs of the invention where an excisable linker
or selectively cleavable linker or spacer is desired.
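
The requirement that a cleavage site be exclusive of any sequence in the Fv chains can be checked mechanically. The following is a minimal Python sketch under stated assumptions: the rule table, the function name, and the test sequence are hypothetical illustrations, each agent is treated as a plain motif match, and the context-dependent preferences of real proteases are ignored.

```python
# Illustrative recognition rules in one-letter amino acid code.
# Each entry is (motif, number of motif residues that precede the cut).
# These rules are a simplification for illustration only.
CLEAVAGE_RULES = {
    "V-8 protease": ("E", 1),      # cuts after a Glu residue
    "Factor Xa": ("IEGR", 4),      # cuts after Ile-Glu-Gly-Arg
    "enterokinase": ("DDDDK", 5),  # cuts after Asp-Asp-Asp-Asp-Lys
    "dilute acid": ("DP", 1),      # cuts the Asp-Pro bond
}


def find_cleavage_sites(sequence, agent):
    """Return the 1-based position of the last residue before each cut."""
    motif, offset = CLEAVAGE_RULES[agent]
    sites = []
    start = sequence.find(motif)
    while start != -1:
        sites.append(start + offset)
        start = sequence.find(motif, start + 1)
    return sites


def is_unique_site(sequence, agent):
    """True when the agent would cut the sequence at exactly one place."""
    return len(find_cleavage_sites(sequence, agent)) == 1


# Hypothetical leader-plus-Fv fragment, invented for this example.
fusion = "MKTAIEGRQVQLQQSGPE"
print(find_cleavage_sites(fusion, "Factor Xa"))
print(is_unique_site(fusion, "V-8 protease"))
```

In this toy sequence the Factor Xa motif occurs once, whereas Glu occurs twice, so a V-8 protease site would not be unique; this is the kind of conflict the design rule is meant to avoid.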

Design of Synthetic V H and V L Mimics

[0077] FRs from the heavy and light chain murine anti-digoxin monoclonal 26-10 (Figures 4A and 4B) were encoded
on the same DNAs with CDRs from the murine anti-lysozyme monoclonal glp-4 heavy chain (Figure 3 sequence 1)
and light chain to produce V H (Figure 4C) and V L (Figure 4D) regions together defining a biosynthetic antibody binding
site which is specific for lysozyme. Murine CDRs from both the heavy and light chains of monoclonal glp-4 were encoded
on the same DNAs with FRs from the heavy and light chains of human myeloma antibody NEWM (Figures 4E and 4F).
The resulting interspecies chimeric antibody binding domain has reduced immunogenicity in humans because of its
human FRs, and specificity for lysozyme because of its murine CDRs.

[0078] A synthetic DNA was designed to facilitate CDR insertions into a human heavy chain FR and to facilitate
empirical refinement of the resulting chimeric amino acid sequence. This DNA is depicted in Figure 5.

[0079] A synthetic, bifunctional FB-binding site protein was also designed at the DNA level, expressed, purified,
renatured, and shown to bind specifically with a preselected antigen (digoxin) and Fc. The detailed primary structure
of this construct is shown in Figure 6; its tertiary structure is illustrated schematically in Figure 2B.
[0080] Details of these and other experiments, and additional design principles on which the invention is based, are
set forth below.

GENE DESIGN AND EXPRESSION 

[0081] Given known variable region DNA sequences, synthetic V L and V H genes may be designed which encode
native or near native FR and CDR amino acid sequences from an antibody molecule, each separated by unique
restriction sites located as close to FR-CDR and CDR-FR borders as possible. Alternatively, genes may be designed
which encode native FR sequences which are similar or identical to the FRs of an antibody molecule from a selected
species, each separated by "dummy" CDR sequences containing strategically located restriction sites. These DNAs
serve as starting materials for producing BABS, as the native or "dummy" CDR sequences may be excised and replaced
with sequences encoding the CDR amino acids defining a selected binding site. Alternatively, one may design and
directly synthesize native or near-native FR sequences from a first antibody molecule, and CDR sequences from a
second antibody molecule. Any one of the V H and V L sequences described above may be linked together directly, via
an amino acid chain or linker connecting the C-terminus of one chain with the N-terminus of the other.
[0082] These genes, once synthesized, may be cloned with or without additional DNA sequences coding for, e.g.,
an antibody constant region, enzyme, or toxin, or a leader peptide which facilitates secretion or intracellular stability
of a fusion polypeptide. The genes then can be expressed directly in an appropriate host cell, or can be further
engineered before expression by the exchange of FR, CDR, or "dummy" CDR sequences with new sequences. This
manipulation is facilitated by the presence of the restriction sites which have been engineered into the gene at the
FR-CDR and CDR-FR borders.

[0083] Figure 3 illustrates the general approach to designing a chimeric V H ; further details of exemplary designs at
the DNA level are shown in Figures 4A-4F. Figure 3, lines 1 and 2, show the amino acid sequences of the heavy chain
variable region of the murine monoclonals glp-4 (anti-lysozyme) and 26-10 (anti-digoxin), including the four FR and
three CDR sequences of each. Line 3 shows the sequence of a chimeric V H which comprises 26-10 FRs and glp-4
CDRs. As illustrated, the hybrid protein of line 3 is identical to the native protein of line 2, except that 1) the sequence
TFTNYYIHWLK has replaced the sequence IFTDFYMNWVR, 2) EWIGWIYPGNGNTKYNENFKG has replaced
DYIGYISPYSGVTGYNQKFKG, 3) RYTHYYF has replaced GSSGNKWAM, and 4) A has replaced V as the sixth amino
acid beyond CDR-2. These changes have the effect of changing the specificity of the 26-10 V H to mimic the specificity
of glp-4. The Ala to Val single amino acid replacement within the relatively conserved framework region of 26-10 is an




S B^rt «"y »n«body. o, obtaining .he sequence from the litetalu.e, in view of thte d'scltjo.e one 
E ,„ STZZ a BABS of any deaited epecifiedy eotnpeising any deei.ed Iramewo-k WaJ*« 

r^JKC^STe-t-----------' 

S,„ C T £"s r sk , nea i ,id of a commercially available computet ptoaram wtmlt performs combined 

fr CDR boundarv is not an absolute one, and because the amino acid sequence of the FR may not permit a 

code) of a synthetic DNA comprising a master framework gene having the generic structure:

R 1 -FR 1 -X 1 -FR 2 -X 2 -FR 3 -X 3 -FR 4 -R 2

Clal for CDR 3 . 

OLIGONUCLEOTIDE SYNTHESIS

[0090] The synthetic genes and DNA fragments designed as described above preferably are produced by assembly
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TheDN^^ 

and ligated into larger blocks which may also be purified by PAGE. 
CLONING OF SYNTHETIC OLIGONUCLEOTIDES

[0091] The blocks or the pairs of longer oligonucleotides may be cloned into E. coli using a suitable, e.g., pUC,
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cloning vector. Initially, this vector may be altered by single strand mutagenesis to eliminate residual six base altered 
sites. For example, V H may be synthesized and cloned into pUC as five primary blocks spanning the following restriction
sites: 1. EcoRI to first NarI site; 2. first NarI to XbaI; 3. XbaI to SalI; 4. SalI to NcoI; 5. NcoI to BamHI. These cloned
fragments may then be isolated and assembled in several three-fragment ligations and cloning steps into the pUC8
plasmid. Desired ligations selected by PAGE are then transformed into, for example, E. coli strain JM83, and plated
onto LB Ampicillin + Xgal plates according to standard procedures. The gene sequence may be confirmed by supercoil
sequencing after cloning, or after subcloning into M13 via the dideoxy method of Sanger.

PRINCIPLE OF CDR EXCHANGE 

[0092] Three CDRs (or alternatively, four FRs) can be replaced per V H or V L . In simple cases, this can be
accomplished by cutting the shuttle pUC plasmid containing the respective genes at the two unique restriction sites flanking
each CDR or FR, removing the excised sequence, and ligating the vector with a native nucleic acid sequence or a
synthetic oligonucleotide encoding the desired CDR or FR. This three part procedure would have to be repeated three
times for total CDR replacement and four times for total FR replacement. Alternatively, a synthetic nucleotide encoding
two consecutive CDRs separated by the appropriate FR can be ligated to a pUC or other plasmid containing a gene
whose corresponding CDRs and FR have been cleaved out. This procedure reduces the number of steps required to
perform CDR and/or FR exchange.
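
The cut, remove, and ligate cycle described for a single CDR can be sketched as a string operation. The Python below is a deliberately simplified model, not the laboratory procedure: recognition sequences are treated as fixed boundaries that are retained, actual cut positions and overhangs are ignored, and the toy gene, the "dummy" CDR, and the function name are hypothetical. The XbaI, SalI, EcoRI, and BamHI recognition sequences are the standard ones.

```python
# Standard six-base recognition sequences for the enzymes used here.
SITES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "SalI": "GTCGAC",
    "BamHI": "GGATCC",
}


def exchange_cassette(gene, enzyme_5, enzyme_3, insert):
    """Replace the DNA lying between two unique restriction sites.

    The recognition sequences themselves are kept in place, which is a
    simplification of real digestion and ligation.
    """
    site_5, site_3 = SITES[enzyme_5], SITES[enzyme_3]
    for name, site in ((enzyme_5, site_5), (enzyme_3, site_3)):
        if gene.count(site) != 1:
            raise ValueError(f"{name} site is not unique in the gene")
    left = gene.index(site_5) + len(site_5)
    right = gene.index(site_3)
    if left > right:
        raise ValueError("5' site must lie upstream of the 3' site")
    return gene[:left] + insert + gene[right:]


# Toy gene: FR stretch, dummy CDR flanked by XbaI and SalI, FR stretch.
toy_gene = "GAATTC" + "AAA" + "TCTAGA" + "CCCCCC" + "GTCGAC" + "TTT" + "GGATCC"
grafted = exchange_cassette(toy_gene, "XbaI", "SalI", "GCTGCT")
print(grafted)
```

Calling the function once per flanked CDR corresponds to the three repetitions the text describes for total CDR replacement; the uniqueness check reflects why the sites must be unique within the plasmid.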

EXPRESSION OF PROTEINS

[0093] The engineered genes can be expressed in appropriate prokaryotic hosts such as various strains of E. coli,
and in eucaryotic hosts such as Chinese hamster ovary cells, murine myeloma, and human myeloma/transfectoma cells.
[0094] For example, if the gene is to be expressed in E. coli, it may first be cloned into an expression vector. This is
accomplished by positioning the engineered gene downstream from a promoter sequence such as trp or tac, and a
gene coding for a leader peptide. The resulting expressed fusion protein accumulates in refractile bodies in the
cytoplasm of the cells, and may be harvested after disruption of the cells by French press or sonication. The refractile
bodies are solubilized, and the expressed proteins refolded and cleaved by the methods already established for many
other recombinant proteins.

[0095] If the engineered gene is to be expressed in myeloma cells, the conventional expression system for
immunoglobulins, it is first inserted into an expression vector containing, for example, the Ig promoter, a secretion signal,
immunoglobulin enhancers, and various introns. This plasmid may also contain sequences encoding all or part of a
constant region, enabling an entire part of a heavy or light chain to be expressed. The gene is transfected into myeloma
cells via established electroporation or protoplast fusion methods. Cells so transfected can express V L or V H fragments,
V L2 or V H2 homodimers, V L -V H heterodimers, V H -V L or V L -V H single chain polypeptides, complete heavy or light
immunoglobulin chains, or portions thereof, each of which may be attached in the various ways discussed above to a
protein region having another function (e.g., cytotoxicity).

[0096] Vectors containing a heavy chain V region (or V and C regions) can be cotransfected with analogous vectors
carrying a light chain V region (or V and C regions), allowing for the expression of noncovalently associated binding
sites (or complete antibody molecules).

[0097] In the examples which follow, a specific example of how to make a single chain binding site is disclosed, 
together with methods employed to assess its binding properties. Thereafter, a protein construct having two functional 
domains is disclosed. Lastly, there is disclosed a series of additional targeted proteins. 

I EXAMPLE OF CDR EXCHANGE AND EXPRESSION

[0098] The synthetic genes coding for murine V H and V L 26-10 shown in Figures 4A and 4B were designed from the
known amino acid sequence of the protein with the aid of Compugene, a software program. These genes, although
coding for the native amino acid sequences, also contain non-native and often unique restriction sites flanking nucleic
acid sequences encoding CDRs to facilitate CDR replacement as noted above.

[0099] Both the 3' and 5' ends of the large synthetic oligomers were designed to include 6-base restriction sites,
present in the genes and the pUC vector. Furthermore, those restriction sites in the synthetic genes which were only
suited for assembly but not for cloning in pUC were extended by "helper" cloning sites with matching sites in pUC.
[0100] Cloning of the synthetic DNA and later assembly of the gene is facilitated by the spacing of unique restriction
sites along the gene. This allows corrections and modifications by cassette mutagenesis at any location. Among them
are alterations near the 5' or 3' ends of the gene as needed for the adaptation to different expression vectors. For
example, a PstI site is positioned near the 5' end of the V H gene. Synthetic linkers can be attached easily between this
site and a restriction site in the expression plasmid. These genes were synthesized by assembling oligonucleotides
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as described above using a Biosearch Mode. 8600 DMA Synthesizer. They were ligated to vector pUC8 for transfor- 
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Protein Design : 
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not to interfere with domain folding. Thus, the 15 residue sequence (Gly-Gly-Gly-Gly-Ser) 3 was selected to connect 
the V H carboxyl- and V L amino-termini. 

[0111] Binding studies with single chain binding sites having less than or greater than 15 residues demonstrate the
importance of the prerequisite distance which must separate V H from V L ; for example, a (Gly 4 -Ser) 1 linker does not
demonstrate binding activity, and those with (Gly 4 -Ser) 5 linkers exhibit very low activity compared to those with
(Gly 4 -Ser) 3 linkers.
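
The three linker variants compared above differ only in the number of Gly 4 -Ser repeats. A minimal Python sketch, assuming one-letter amino acid code and using hypothetical placeholder fragments for the V H carboxyl terminus and V L amino terminus, makes the residue counts explicit:

```python
GLY4_SER = "GGGGS"  # one Gly-Gly-Gly-Gly-Ser unit in one-letter code


def gly_ser_linker(repeats):
    """Return a (Gly4-Ser)n linker as a one-letter sequence."""
    if repeats < 1:
        raise ValueError("at least one repeat is required")
    return GLY4_SER * repeats


def join_single_chain(vh, vl, repeats=3):
    """Connect the V H carboxyl terminus to the V L amino terminus."""
    return vh + gly_ser_linker(repeats) + vl


# Linker lengths for the three cases discussed in the text.
for n in (1, 3, 5):
    print(n, len(gly_ser_linker(n)))

# Placeholder domain fragments, invented for this example only.
print(join_single_chain("TVSS", "DVVM"))
```

The one-, three-, and five-repeat linkers contain 5, 15, and 25 residues respectively, which spans the 5 to 25 residue range the specification gives for spacers.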

Gene Synthesis : 

[0112] Design of the 744 base sequence for the synthetic binding site gene was derived from the Fv protein sequence
of 26-10 by choosing codons frequently used in E. coli. The model of this representative synthetic gene is shown in
Figure 8, discussed previously. Synthetic genes coding for the trp promoter-operator, the modified trp LE leader peptide
(MLE), the sequence of which is shown in Figure 10A, and V H were prepared largely as described previously. The
gene coding for V H was assembled from 46 chemically synthesized oligonucleotides, all 15 bases long, except for
terminal fragments (13 to 19 bases) that included cohesive cloning ends. Between 8 and 15 overlapping
oligonucleotides were enzymatically ligated into double stranded DNA, cut at restriction sites suitable for cloning (NarI, XbaI,
SalI, SacII, SacI), purified by PAGE on 8% gels, and cloned in pUC which was modified to contain additional cloning
sites in the polylinker. The cloned segments were assembled stepwise into the complete gene mimicking V H by ligations
in the pUC cloning vector.
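
Deriving a gene from a protein sequence by codon choice amounts to back-translation with a one-codon-per-residue table. The Python sketch below illustrates the idea only. The table is an assumption chosen for illustration (each codon is a valid codon for its amino acid and is commonly regarded as well used in E. coli), and it is not the codon set actually used for the 26-10 gene.

```python
# Illustrative one-codon-per-residue table. An assumption for this
# sketch, not the codon choices made in the specification.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGT", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "TCT", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTT",
}


def back_translate(protein):
    """Return a DNA sequence encoding `protein` (one-letter code)."""
    try:
        return "".join(PREFERRED_CODON[residue] for residue in protein)
    except KeyError as err:
        raise ValueError(f"unknown residue: {err.args[0]}") from None


# Example: the 15-residue (Gly4-Ser)3 linker encodes as 45 bases.
linker_dna = back_translate("GGGGS" * 3)
print(linker_dna, len(linker_dna))
```

A practical design would also vary synonymous codons to place or remove restriction sites, which a fixed table like this one cannot do.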

[0113] The gene mimicking 26-10 V L was assembled from 12 long synthetic polynucleotides ranging in size from 33
to 88 base pairs, prepared in automated DNA synthesizers (Model 6500, Biosearch, San Rafael, CA; Model 380A,
Applied Biosystems, Foster City, CA). Five individual double stranded segments were made out of pairs of long synthetic
oligonucleotides spanning six-base restriction sites in the gene (AatII, BstEII, KpnI, HindIII, BglII, and PstI). In one
case, four long overlapping strands were combined and cloned. Gene fragments bounded by restriction sites for
assembly that were absent from the pUC polylinker, such as AatII and BstEII, were flanked by EcoRI and BamHI ends
to facilitate cloning.

[0114] The linker between V H and V L , encoding (Gly-Gly-Gly-Gly-Ser) 3 , was cloned from two long synthetic
oligonucleotides, 54 and 62 bases long, spanning SacI and AatII sites, the latter followed by an EcoRI cloning end. The
complete single chain binding site gene was assembled from the V H , V L , and linker genes to produce a construct,
corresponding to aspartyl-prolyl-V H -(linker)-V L , flanked by EcoRI and PstI restriction sites.

[0115] The trp promoter-operator, starting from its SspI site, was assembled from 12 overlapping 15 base oligomers,
and the MLE leader gene was assembled from 24 overlapping 15 base oligomers. These were cloned and assembled
in pUC using the strategy of assembly sites flanked by cloning sites. The final expression plasmid was constructed in
the pBR322 vector by a 3-part ligation using the sites SspI, EcoRI, and PstI (see Figure 10B). Intermediate DNA
fragments and assembled genes were sequenced by the dideoxy method.

Fusion Protein Expression : 

[0116] Single-chain protein was expressed as a fusion protein. The MLE leader gene (Fig. 10A) was derived from
the E. coli trp LE sequence and expressed under the control of a synthetic trp promoter and operator. E. coli strain JM83
was transformed with the expression plasmid and protein expression was induced in M9 minimal medium by addition
of indoleacrylic acid (10 µg/ml) at a cell density with A 600 = 1. The high expression levels of the fusion protein resulted
in its accumulation as insoluble protein granules, which were harvested from cell paste (Figure 11, Lane 1).

Fusion Protein Cleavage :

[0117] The MLE leader was removed from the binding site protein by acid cleavage of the Asp-Pro peptide bond
engineered at the junction of the MLE and binding site sequences. The washed protein granules containing the fusion
protein were cleaved in 6 M guanidine-HCl/10% acetic acid, pH 2.5, incubated at 37°C for 96 hrs. The reaction was
stopped through precipitation by addition of a 10-fold excess of ethanol with overnight incubation at -20°C, followed
by centrifugation and storage at -20°C until further purification (Figure 11, Lane 2).

Protein Purification : 

[0118] The acid cleaved binding site was separated from remaining intact fused protein species by chromatography
on DEAE cellulose. The precipitate obtained from the cleavage mixture was redissolved in 6 M guanidine-HCl + 0.2
M Tris-HCl, pH 8.2, + 0.1 M 2-mercaptoethanol and dialyzed exhaustively against 6 M urea + 2.5 mM Tris-HCl, pH 7.5,
+ 1 mM EDTA. 2-Mercaptoethanol was added to a final concentration of 0.1 M, the solution was incubated for 2 hrs at
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Affinity Chromatography: 
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Cleavage yield (%) 
prior step 
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protein 


Cell paste 
Fusion protein 
Granules 

Acid Cleavage/DE 52 
pool 

Ouabain-Sepharose 
pool 


12.0 g 
2.3 g 


1440.0 mg a 
480.0 mg a ' b 

144.0 mg 
18.1 mg 


100.0% 
38.0 e 
12.6 d 


100.0% 
38.0 e 
4.7 e 
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©Percentage yield calculated on a molar basis 
^^^^.ycic of fiene and Protein: 

«* * UM A T "? Tfi^jtS. **» pro,* 1..ctior*<i by 
cut out and sequenced. 
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Specificity Determination : 

[0122] Specificities of anti-digoxin 26-10 Fab and the BABS were assessed by radioimmunoassay. Wells of microtiter
plates were coated with affinity-purified goat anti-murine Fab fragment (ICN ImmunoBiologicals, Lisle, IL) at 10 µg/ml
in PBSA overnight at 4°C. After the plates were washed and blocked with 1% horse serum in PBSA, solutions (50 µl)
containing 26-10 Fab or the BABS in either PBSA or 0.01 M sodium acetate at pH 5.5 were added to the wells and
incubated 2-3 hrs at room temperature. After unbound antibody fragment was washed from the wells, 25 µl of a series
of concentrations of cardiac glycosides (10^-4 to 10^-11 M in PBSA) were added. The cardiac glycosides tested included
digoxin, digitoxin, digoxigenin, digitoxigenin, gitoxin, ouabain, and acetyl strophanthidin. After the addition of
125 I-digoxin (25 µl, 50,000 cpm; Cambridge Diagnostics, Billerica, MA) to each well, the plates were incubated overnight at
4°C, washed and counted. The inhibition curves are plotted in Figure 12. The relative affinities for each digoxin analogue
were calculated by dividing the concentration of each analogue at 50% inhibition by the concentration of digoxin (or
digoxigenin) that gave 50% inhibition. There is a displacement of inhibition curves for the BABS to lower glycoside
concentrations than observed for 26-10 Fab, because less active BABS than 26-10 Fab was bound to the plate. When
0.25 M urea was added to the BABS in 0.01 M sodium acetate, pH 5.5, more active sFv was bound to the goat
anti-murine Fab coating on the plate. This caused the BABS inhibition curves to shift toward higher glycoside concentrations,
closer to the position of those for 26-10 Fab, although maintaining the relative positions of curves for sFv obtained in
acetate buffer alone. The results, expressed as normalized concentration of inhibitor giving 50% inhibition of
125 I-digoxin binding, are shown in Table 2.

TABLE 2

26-10 Antibody   Normalizing
Species          Glycoside      D     DG    DO    DOG   A-S   G     O

Fab              Digoxin        1.0   1.2   0.9   1.0   1.3   9.6   15
                 Digoxigenin    0.9   1.0   0.8   0.9   1.1   8.1   13
BABS             Digoxin        1.0   7.3   2.0   2.6   5.9   62    150
                 Digoxigenin    0.1   1.0   0.3   0.4   0.8   8.5   21

D = Digoxin; DG = Digoxigenin; DO = Digitoxin; DOG = Digitoxigenin;
A-S = Acetyl Strophanthidin; G = Gitoxin; O = Ouabain
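
The normalization behind Table 2 is a simple ratio: the concentration of each analogue giving 50% inhibition divided by that of the normalizing glycoside. A minimal Python sketch follows. The absolute concentrations are hypothetical, chosen only so that their ratios match three entries of the BABS rows; the specification reports the normalized values, not these raw numbers.

```python
def normalize_ic50(ic50_by_glycoside, reference):
    """Divide every 50%-inhibition concentration by the reference value."""
    ref_value = ic50_by_glycoside[reference]
    return {
        name: value / ref_value
        for name, value in ic50_by_glycoside.items()
    }


# Hypothetical 50%-inhibition concentrations in mol/l (illustration only).
babs_ic50 = {
    "digoxin": 1.0e-9,
    "digoxigenin": 7.3e-9,
    "ouabain": 1.5e-7,
}

for reference in ("digoxin", "digoxigenin"):
    relative = normalize_ic50(babs_ic50, reference)
    print(reference, {k: round(v, 1) for k, v in relative.items()})
```

Because both rows for a given species come from the same curves, the digoxigenin-normalized values are the digoxin-normalized values divided by the digoxigenin entry, which is a useful consistency check on the table.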
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Affinity Determination: 

[0123] Association constants were measured by equilibrium binding studies. In immunoprecipitation experiments,
100 µl of 3 H-digoxin (New England Nuclear, Billerica, MA) at a series of concentrations (10^-7 M to 10^-11 M) were added
to 100 µl of 26-10 Fab or the BABS at a fixed concentration. After 2-3 hrs of incubation at room temperature, the protein
was precipitated by the addition of 100 µl goat antiserum to murine Fab fragment (ICN ImmunoBiologicals), 50 µl of
the IgG fraction of rabbit anti-goat IgG (ICN ImmunoBiologicals), and 50 µl of a 10% suspension of protein A-Sepharose
(Sigma). Following 2 hrs at 4°C, bound and free antigen were separated by vacuum filtration on glass fiber filters
(Vacuum Filtration Manifold, Millipore, Bedford, MA). Filter disks were then counted in 5 ml of scintillation fluid with a
Model 1500 Tri-Carb Liquid Scintillation Analyzer (Packard, Sterling, VA). The association constants, K 0 , were
calculated from Scatchard analyses of the untransformed radioligand binding data using LIGAND, a non-linear curve fitting
program based on mass action. K 0 s were also calculated by Sips plots and binding isotherms shown in Figure 13A for
the BABS and 13B for the Fab. For binding isotherms, data are plotted as the concentration of digoxin bound versus
the log of the unbound digoxin concentration, and the dissociation constant is estimated from the ligand concentration
at 50% saturation. These binding data are also plotted in linear form as Sips plots (inset), having the same abscissa
as the binding isotherm but with the ordinate representing log r/(n-r), defined below. The average intrinsic association
constant (K 0 ) was calculated from the modified Sips equation (39), log [r/(n-r)] = a log C - a log K 0 , where r equals moles
of digoxin bound per mole of antibody at an unbound digoxin concentration equal to C; n is the number of moles of
digoxin bound at saturation of the antibody binding site, and a is an index of heterogeneity which describes the
distribution of association constants about the average intrinsic association constant. Least squares linear regression
analysis of the data indicated correlation coefficients for the lines obtained were 0.96 for the BABS and 0.99 for 26-10
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Fab. A summary of the calculated association constants are shown below in Table 3. 



Method of Data       Association Constant, K 0
Analysis             K 0 (BABS), M^-1      K 0 (Fab), M^-1

Scatchard plot       (3.2±0.9) x 10^7      (1.9±0.2) x 10^8
Sips plot            2.6 x 10^7            1.8 x 10^8
Binding isotherm     5.2 x 10^7            3.3 x 10^8
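
The binding-isotherm estimate described above (dissociation constant read off as the unbound ligand concentration at 50% saturation, association constant as its reciprocal) can be sketched in Python. This is an illustration under stated assumptions: the data are synthetic, generated from a simple one-site model with an arbitrarily chosen dissociation constant, and the function names are hypothetical. It is not the LIGAND program or the Scatchard or Sips analysis.

```python
import math


def fraction_bound(free_conc, kd):
    """One-site model: fractional saturation at a given free ligand level."""
    return free_conc / (free_conc + kd)


def kd_at_half_saturation(free_concs, bound, bound_max):
    """Interpolate on a log-concentration axis to find 50% saturation."""
    half = bound_max / 2.0
    points = sorted(zip(free_concs, bound))
    for (c_lo, b_lo), (c_hi, b_hi) in zip(points, points[1:]):
        if b_lo <= half <= b_hi and b_hi > b_lo:
            t = (half - b_lo) / (b_hi - b_lo)
            log_c = math.log10(c_lo) + t * (math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_c
    raise ValueError("data do not bracket 50% saturation")


# Synthetic isotherm: free digoxin from 1e-11 M to 1e-7 M, assumed Kd.
assumed_kd = 2.0e-8  # mol/l, chosen for illustration only
concs = [10.0 ** (-11 + 0.25 * i) for i in range(17)]
bound = [fraction_bound(c, assumed_kd) for c in concs]

kd_est = kd_at_half_saturation(concs, bound, bound_max=1.0)
ka_est = 1.0 / kd_est  # association constant in l/mol
print(f"Kd ~ {kd_est:.2e} M, K0 ~ {ka_est:.2e} M^-1")
```

Interpolating on the logarithmic axis matches how the isotherm is plotted; with real data the saturation level would itself have to be estimated, which is one reason the three methods in the table give somewhat different constants.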
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III S ynthesis of a Multifunctional Protein 

L X - FB »' P'»'* * « 2 3 'S:."^.1^^1.«ld dosage *w» 

irr^X"^:"^ sssks-* . — - -* — • 

Synthetic oligonucleotides are cloned according ^"™?^ fe then ressed in E . CO li. 
fected into E. coH. The completed fused 0™""°*" F«u» 6A isjhen e p jne hydrochlo . 

[0126] After sonication, inclus.on bod.es were protein was denatured and reduced in the 

ride (GuHCI), 0.2 M Tris, and 0.1 M 2-mercaptoethanol f^PHJJ™^ ■ fusion tein from lhe 

solvent overnight at room ternp^.u-Size ^^^^ of 6M GuHC. and 0.01 M NaOAc, P H 

suffering significant proteolytic degradation. GuHCI-Tris-BME solution, and dialysate 

[0127] For refolding, the protein was ; d.alyzed against 100 , mid the same ^ ^ ^ 

was diluted 11-fold over two days to 0.55 M °""^^*J^,^^e belr^ assayed by RIA's for binding of 125 l- 
transferredto0.01MNaCI,andthep« 

labelled digoxin. The refolding procedure can be wMwdby matanga rap m ^ phos _ 

showed binding activity, as indicated in Figure 7A. 
Demonstration of Bifunctionality:

[...] immunoassays were performed.

[0129] Properties of the binding site were probed by a modification of an assay developed by Mudgett-Hunter et al.
(J. Immunol. (1982) 129:1165-1172; Molec. Immunol. [...]) so that it could be run on microtiter plates as a solid phase
sandwich assay. [...] Fab antisera (gAmFab) as the primary antibody that initially coats the wells of the plate. [...]
antisera which recognize epitopes that appear to reside mostly on framework regions. The samples [...] with the gAmFab,
which binds species that exhibit appropriate [...]. After washing away unbound protein, the wells are exposed to
125I-labelled (radio[...]) [...] or 125I-dig-lysine.
[0130] The data are plotted in Figure 7A [...] experiment in which the parent 26-10 antibody was included as a
control. The sites were probed with 125I-dig-BSA as described above, with a series of dilutions prepared from [...]
diluted/quickly refolded [...] to be the same for both proteins.

[0131] The sensitivity of these assays is such that binding affinity [...] must be at least 10^6. Experimental data
on digoxin binding yielded binding constants [...] antibody has an affinity of 5.4 x 10^[...] M^-1. Inhibition assays
also indicate the binding of [...] and can be inhibited by unlabelled digoxin, digoxigenin, digitoxin, digitoxigenin,
gitoxin, acetyl strophanthidin, and ouabain in a way largely parallel to the parent 26-10 Fab. This indicates that the
specificity of the biosynthetic protein is substantially identical to the original monoclonal.

[0132] In a second type of assay, digoxin-BSA is used to coat microtiter plates. Renatured BABS (FB-BABS) is
added to the coated plates so that only molecules that have a competent binding site can stick to the plate. 125I-labelled
rabbit IgG (radioligand) is mixed with bound FB-BABS on the plates. Bound radioactivity reflects the interaction of IgG
with the FB domain of the BABS, and the specificity of this binding is demonstrated by its inhibition with increasing
amounts of FB, Protein A, rabbit IgG, IgG2a, and IgG1, as shown in Figure 7B.

[0133] The following species were tested in order to demonstrate authentic binding: unlabelled rabbit IgG and IgG2a
monoclonal antibody (which bind competitively to the FB domain of the BABS); and protein A and FB (which bind
competitively to the radioligand). As shown in Figure 7B, these species are found to completely inhibit radioligand
binding, as expected. A monoclonal antibody of the IgG1 subclass binds poorly to the FB, as expected, inhibiting only about
34% of the radioligand from binding. These data indicate that the BABS domain and the FB domain have independent
activity.


IV. OTHER CONSTRUCTS 

[0134] Other BABS-containing proteins expressible in E. coli and other host cells as described above are set forth in
the drawing. These proteins may be bifunctional or multifunctional. Each construct includes a single chain BABS linked
via a spacer sequence to an effector molecule comprising amino acids encoding a biologically active effector protein
such as an enzyme, receptor, toxin, or growth factor. Some examples of such constructs shown in the drawing include
proteins comprising epidermal growth factor (EGF) (Figure 15A), streptavidin (Figure 15B), tumor necrosis factor (TNF)
(Figure 15C), calmodulin (Figure 15D), the beta chain of platelet derived growth factor (B-PDGF) (Figure 15E), ricin A
(Figure 15F), interleukin 2 (Figure 15G), and FB dimer (Figure 15H). Each is used as a trailer and is connected to a
preselected BABS via a spacer (Gly-Ser-Gly) encoded by DNA defining a BamHI restriction site. Additional amino acids
may be added to the spacer for empirical refinement of the construct if necessary by opening up the BamHI site and
inserting an oligonucleotide of a desired length having BamHI sticky ends. Each gene also terminates with a PstI site
to facilitate insertion into a suitable expression vector.
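The relationship between the Gly-Ser-Gly spacer and the BamHI site can be illustrated with a short sketch. The BamHI recognition sequence GGATCC, read in frame, encodes Gly-Ser, and the enzyme cuts after the first G to leave a 5'-GATC overhang. Everything else here is an assumption made for illustration: the third codon (GGT), the example insert, the reduced codon table, and the function names are hypothetical, and the ligation is modelled on the top strand only.

```python
# Minimal codon table holding only the codons used in this sketch.
CODON_TO_RESIDUE = {
    "GGA": "Gly", "GGT": "Gly", "GGC": "Gly", "GGG": "Gly",
    "TCC": "Ser", "TCT": "Ser",
}

BAMHI_SITE = "GGATCC"  # BamHI cuts G^GATCC, leaving a 5'-GATC overhang


def translate(dna):
    """Translate an in-frame DNA string into a list of three-letter residues."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [CODON_TO_RESIDUE[dna[i:i + 3]] for i in range(0, len(dna), 3)]


def insert_at_bamhi(dna, insert_top_strand):
    """Open the first BamHI site and ligate an oligonucleotide with BamHI sticky ends.

    insert_top_strand is the insert's top strand read 5'->3'; it must begin with
    the GATC overhang and be a multiple of 3 long to preserve the reading frame.
    """
    if not insert_top_strand.startswith("GATC"):
        raise ValueError("insert must start with the GATC overhang")
    if len(insert_top_strand) % 3 != 0:
        raise ValueError("insert length must keep the reading frame")
    cut = dna.index(BAMHI_SITE) + 1  # position just after the first G of the site
    return dna[:cut] + insert_top_strand + dna[cut:]


SPACER_DNA = "GGATCCGGT"  # hypothetical spacer coding sequence: Gly-Ser-Gly
EXTENDED_DNA = insert_at_bamhi(SPACER_DNA, "GATCTGGTG")  # hypothetical 9-nt insert

if __name__ == "__main__":
    print("-".join(translate(SPACER_DNA)))
    print("-".join(translate(EXTENDED_DNA)))
```

In this example the insert adds three residues and regenerates a BamHI site downstream, so the spacer could be lengthened again in the same way.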

[0135] The BABS of the EGF and PDGF constructs may be, for example, specific for fibrin so that the EGF or PDGF
is delivered to the site of a wound. The BABS for TNF and ricin A may be specific to a tumor antigen, e.g., CEA, to
produce a construct useful in cancer therapy. The calmodulin construct binds radioactive ions and other metal ions.
Its BABS may be specific, for example, to fibrin or a tumor antigen, so that it can be used as an imaging agent to locate
a thrombus or tumor. The streptavidin construct binds biotin with very high affinity. The biotin may be labeled with
a remotely detectable ion for imaging purposes. Alternatively, the biotin may be immobilized on an affinity matrix or
solid support. The BABS-streptavidin protein could then be bound to the matrix or support for affinity chromatography
or solid phase immunoassay. The interleukin-2 construct could be linked, for example, to a BABS specific for a T-cell
surface antigen. The FB-FB dimer binds to Fc, and could be used with a BABS in an immunoassay or affinity purification
procedure linked to a solid phase through immobilized immunoglobulin.

[0136] Figure 14 exemplifies a multifunctional protein having an effector segment as a leader. It comprises an FB-FB 
dimer linked through its C-terminal via an Asp-Pro dipeptide to a BABS of choice. It functions in a way very similar to 
the construct of Fig. 15H. The dimer binds avidly to the Fc portion of immunoglobulin. This type of construct can 
accordingly also be used in affinity chromatography, solid phase immunoassay, and in therapeutic contexts where 
coupling of immunoglobulins to another epitope is desired. 

[0137] In view of the foregoing, it should be apparent that the invention is unlimited with respect to the specific types
of single polypeptide chains and linkers as well as the types of BABS and effector proteins to be linked. Accordingly, other
embodiments are within the following claims.

[0138] Also contemplated is a biosynthetic binding protein expressed from DNA derived by recombinant techniques,

said binding protein comprising a single polypeptide chain comprising at least two polypeptide domains connected
by a polypeptide linker, the amino acid sequence of each of said polypeptide domains comprising a set of CDRs
interposed between a set of FRs, each of which is respectively homologous with at least a portion of CDRs and
FRs from an immunoglobulin molecule,

said polypeptide linker comprising plural, peptide-bonded amino acids defining a polypeptide of a length sufficient
to span the distance between the C-terminal end of one of said domains and the N-terminal end of the other of
said domains when said binding protein assumes a conformation suitable for binding, and comprising hydrophilic
amino acids which together assume an unstructured polypeptide configuration in aqueous solution,

said binding protein being capable of binding to a preselected antigenic site, determined by the collective tertiary
structure of said sets of CDRs held in proper conformation by said sets of FRs and said linker when disposed in
aqueous solution.



Claims 

1. A single polypeptide chain comprising a linking sequence of a length of at least 10 amino acid residues, the linking
sequence connecting a first and a second non-naturally peptide-bonded, biologically active polypeptide domain
to form a single polypeptide chain comprising at least two biologically active domains connected by said linking
sequence, said linking sequence comprising hydrophilic peptide-bonded amino acids exhibiting small and unreactive
side chains but no cysteine, the hydrophilic amino acids constituting a hydrophilic sequence having a flexible
unstructured configuration essentially free of secondary structure in aqueous solution, the linking sequence having
a plurality of glycine or serine residues and spanning the distance between the C-terminal end of the first domain
and the N-terminal end of the second domain.

2. The polypeptide chain of claim 1, wherein said linking sequence comprises threonine.

3. The polypeptide chain of claim 1 or 2, further comprising said first domain connected by a peptide bond to said
N-terminal end of said linking sequence and a second domain connected by a peptide bond to the C-terminal end
of said linking sequence.

4. The polypeptide chain of claim 1, wherein said linking sequence comprises plural consecutive copies of an amino
acid sequence.

5. The polypeptide chain of claim 4, comprising the amino acid sequence (GlyGlyGlyGlySer)3.

6. The polypeptide chain of claim 1, wherein said linking sequence comprises one or a pair of amino acid sequences
recognizable by a site-specific cleavage agent.

7. DNA encoding the polypeptide chain of any of the preceding claims. 

8. A polypeptide linker, the linker having a length of at least 10 amino acid residues and linking two non-naturally 
linked polypeptide domains to form a multifunctional protein, said linker exhibiting amino acids with small and 
unreactive side chains and comprising plural hydrophilic peptide-bonded amino acids constituting a hydrophilic 
sequence, said linker spanning the distance between the C-terminal end of a first domain and the N-terminal end 
of a second domain, wherein each said domain comprises a biologically active polypeptide having a conformation 
suitable for biological activity independent of the biological activity of the other domain. 

9. A polypeptide linker, the linker having a length of at least 10 amino acid residues and linking two non-naturally 
linked polypeptide domains to form a functional protein, said linker exhibiting amino acids with small and unreactive
side chains and comprising plural hydrophilic peptide-bonded amino acids constituting a hydrophilic sequence, 
said linker spanning the distance between the C-terminal end of a first domain and the N-terminal end of a second 
domain, wherein said domains together comprise an immunologically reactive binding site specific for a preselected 
antigen. 

10. The polypeptide linker of claim 9, wherein said two domains mimic a VH and VL chain from a natural immunoglobulin.

11. The polypeptide linker of claim 8 or 9, which linker

(a) comprises threonine, or 

(b) is cysteine-free, or 

(c) comprises a plurality of glycine or serine residues, or 

(d) comprises plural consecutive copies of an amino acid sequence, or 

(e) spans a distance of at least 4 nm (40 Angstroms), or 

(f) comprises the amino acid sequence GlyGlyGlyGlySerGlyGlyGlyGlySerGlyGlyGlyGlySer, or

(g) comprises one amino acid sequence or a pair of amino acid sequences recognizable by a site-specific 
cleavage agent. 

12. The polypeptide linker of claim 8, wherein at least one of said domains comprises an enzyme, a toxin, a receptor,
a binding site, a biosynthetic antibody binding site, a growth factor, a cell-differentiation factor, a lymphokine, a
cytokine, a hormone, a remotely detectable moiety, or an anti-metabolite.

13. The polypeptide linker of claim 8, wherein said first domain comprises a single chain binding site and said second
domain comprises an enzyme, a toxin, a receptor, a binding site, a biosynthetic antibody binding site, a growth
factor, a cell-differentiation factor, a lymphokine, a cytokine, a hormone, or an anti-metabolite.

14. The polypeptide linker of claim 8, wherein at least one of said domains comprises a polypeptide capable of
sequestering an ion.


15. The polypeptide linker of claim 14, wherein the polypeptide comprises calmodulin, metallothionein, a fragment
thereof, or an amino acid sequence rich in at least one of glutamic acid, aspartic acid, lysine, and arginine.

16. The polypeptide linker of claim 8 or 9, wherein the amino acids of said linker together assume an unstructured
polypeptide configuration in aqueous solution.

17. DNA encoding the polypeptide linker of claim 8 or 9. 

18. Host cell transformed with and capable of expressing the DNA of claim 17. 
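The sequence-level features recited for the linker (length of at least 10 residues, no cysteine, plural glycine or serine residues, consecutive copies of a unit, a span of at least 40 Angstroms) can be checked mechanically, as in the sketch below. It is illustrative only and has no bearing on claim interpretation: the function names are hypothetical, and the span estimate assumes roughly 3.5 Angstroms per residue for a fully extended chain, a figure that is not taken from the claims.

```python
EXTENDED_RISE_ANGSTROM = 3.5  # assumed rise per residue of a fully extended chain


def repeat_unit(sequence):
    """Return (unit, copies) for the shortest unit repeated at least twice, else None."""
    n = len(sequence)
    for size in range(1, n // 2 + 1):
        if n % size == 0 and sequence[:size] * (n // size) == sequence:
            return sequence[:size], n // size
    return None


def linker_report(linker):
    """Check a one-letter linker sequence against the recited sequence features."""
    return {
        "at_least_10_residues": len(linker) >= 10,
        "cysteine_free": "C" not in linker,
        "plural_gly_or_ser": linker.count("G") + linker.count("S") >= 2,
        "consecutive_copies": repeat_unit(linker) is not None,
        "max_span_angstrom": len(linker) * EXTENDED_RISE_ANGSTROM,
    }


LINKER = "GGGGS" * 3  # (GlyGlyGlyGlySer)3 in one-letter code
REPORT = linker_report(LINKER)

if __name__ == "__main__":
    print(repeat_unit(LINKER))
    print(REPORT)
```

Under the stated assumption, the 15-residue linker could span about 52 Angstroms when fully extended, which is consistent with the 40 Angstrom figure in claim 11(e).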


Patent Claims (from the German)

1. A single polypeptide chain comprising a linking sequence having a length of at least 10 amino acid residues,
wherein the linking sequence connects a first and a second non-naturally peptide-bonded, biologically active
polypeptide domain to form a single polypeptide chain, the single polypeptide chain comprising at least two
biologically active domains linked via the linking sequence, wherein the linking sequence comprises hydrophilic
peptide-bonded amino acids with small and unreactive side chains but no cysteine, and wherein the hydrophilic
amino acids form a hydrophilic sequence having a flexible unstructured configuration that is essentially free of
secondary structure in aqueous solution, and wherein the linking sequence contains a plurality of glycine or
serine residues and spans the distance between the C-terminal end of the first domain and the N-terminal end
of the second domain.

2. The polypeptide chain of claim 1, wherein the linking sequence comprises threonine.

3. The polypeptide chain of claim 1 or 2, further comprising the first domain, connected by a peptide bond to the
N-terminal end of the linking sequence, and a second domain, connected by a peptide bond to the C-terminal
end of the linking sequence.

4. The polypeptide chain of claim 1, wherein the linking sequence comprises a plurality of consecutive copies of
an amino acid sequence.

5. The polypeptide chain of claim 4, comprising the amino acid sequence (GlyGlyGlyGlySer)3.

6. The polypeptide chain of claim 1, wherein the linking sequence comprises one or a pair of amino acid sequences
recognized by a site-specific cleavage agent.

7. DNA encoding the polypeptide chain of any of the preceding claims.

8. A polypeptide linker having a length of at least ten amino acid residues, which links two polypeptide domains
that are not naturally linked to form a multifunctional protein and comprises amino acids with small and unreactive
side chains as well as a plurality of hydrophilic peptide-bonded amino acids forming a hydrophilic sequence,
wherein the linker spans the distance between the C-terminal end of a first domain and the N-terminal end of a
second domain, and wherein each of these two domains comprises a biologically active polypeptide having a
conformation suitable for biological activity independent of the biological activity of the other domain.

EP 0 623 679 B1 

9. A polypeptide linker having a length of at least ten amino acid residues, which links two polypeptide domains
that are not naturally linked to form a functional protein and comprises amino acids with small and unreactive
side chains as well as a plurality of hydrophilic peptide-bonded amino acids forming a hydrophilic sequence,
wherein the linker spans the distance between the C-terminal end of a first domain and the N-terminal end of a
second domain, and wherein the domains together comprise an immunologically reactive binding site specific for
a particular antigen.

10. The polypeptide linker of claim 9, wherein the two domains mimic a VH and a VL chain, respectively, of a natural
immunoglobulin.

11. The polypeptide linker of claim 8 or 9, wherein the linker

(a) comprises threonine, or

(b) is free of cysteine, or

(c) comprises a plurality of glycine or serine residues, or

(d) comprises a plurality of consecutive copies of an amino acid sequence, or

(e) spans a distance of at least 4 nm (40 Angstroms), or

(f) comprises the amino acid sequence GlyGlyGlyGlySerGlyGlyGlyGlySerGlyGlyGlyGlySer, or

(g) comprises one amino acid sequence or a pair of amino acid sequences recognized by a site-specific
cleavage agent.

12. The polypeptide linker of claim 8, wherein at least one of the two domains comprises an enzyme, a toxin, a
receptor, a binding site, a binding site of a biosynthetic antibody, a growth factor, a cell-differentiation factor, a
lymphokine, a cytokine, a hormone, an indirectly detectable moiety, or an anti-metabolite.

13. The polypeptide linker of claim 8, wherein the first domain comprises a single-chain binding site and the second
domain comprises an enzyme, a toxin, a receptor, a binding site, a binding site of a biosynthetic antibody, a
growth factor, a cell-differentiation factor, a lymphokine, a cytokine, a hormone, or an anti-metabolite.

14. The polypeptide linker of claim 8, wherein at least one of the two domains comprises a polypeptide capable of
sequestering an ion.

15. The polypeptide linker of claim 14, wherein the polypeptide comprises calmodulin, metallothionein, a fragment
thereof, or an amino acid sequence rich in at least one of the amino acids glutamic acid, aspartic acid, lysine,
and arginine.

16. The polypeptide linker of claim 8 or 9, wherein the amino acids of the linker together assume an unstructured
polypeptide configuration in aqueous solution.

17. DNA encoding the polypeptide linker of claim 8 or 9.

18. Host cell transformed with and capable of expressing the DNA of claim 17.


Claims (from the French)

1. A single polypeptide chain comprising a linking sequence with a length of at least 10 amino acid residues, the
linking sequence connecting a first and a second biologically active polypeptide domain that are not peptide-bonded
in the natural state, so as to form a single polypeptide chain comprising at least two biologically active domains
connected by said linking sequence, said linking sequence comprising hydrophilic peptide-bonded amino acids
having short and unreactive side chains but no cysteine, the hydrophilic amino acids constituting a hydrophilic
sequence having a flexible unstructured configuration essentially free of secondary structure in aqueous solution,
the linking sequence having a plurality of glycine or serine residues and spanning the distance between the
C-terminal end of the first domain and the N-terminal end of the second domain.

2. The polypeptide chain of claim 1, characterized in that said linking sequence comprises threonine.

3. The polypeptide chain of claim 1 or 2, further comprising said first domain connected by a peptide bond to said
N-terminal end of said linking sequence and a second domain connected by a peptide bond to the C-terminal end
of said linking sequence.

4. The polypeptide chain of claim 1, characterized in that said linking sequence comprises several consecutive
copies of an amino acid sequence.

5. The polypeptide chain of claim 4, comprising the amino acid sequence (GlyGlyGlyGlySer)3.

6. The polypeptide chain of claim 1, characterized in that said linking sequence comprises one amino acid sequence
or a pair of amino acid sequences recognizable by a site-specific cleavage agent.

7. DNA encoding the polypeptide chain of any one of the preceding claims.

8. A polypeptide linker, the linker having a length of at least 10 amino acid residues and connecting two polypeptide
domains that are not naturally linked, so as to form a multifunctional protein, said linker having amino acids with
short and unreactive side chains and comprising several hydrophilic peptide-bonded amino acids constituting a
hydrophilic sequence, said linker spanning the distance between the C-terminal end of a first domain and the
N-terminal end of a second domain, each of said domains comprising a biologically active polypeptide having a
conformation suitable for a biological activity independent of the biological activity of the other domain.

9. A polypeptide linker, the linker having a length of at least 10 amino acid residues and connecting two polypeptide
domains that are not naturally linked, so as to form a functional protein, said linker having amino acids with short
and unreactive side chains and comprising several hydrophilic peptide-bonded amino acids constituting a
hydrophilic sequence, said linker spanning the distance between the C-terminal end of a first domain and the
N-terminal end of a second domain, said domains together comprising an immunologically reactive binding site
specific for a preselected antigen.

10. The polypeptide linker of claim 9, characterized in that said two domains mimic a VH and VL chain of a natural
immunoglobulin.

11. The polypeptide linker of claim 8 or 9, wherein said linker

(a) comprises threonine, or

(b) is free of cysteine, or

(c) comprises several glycine or serine residues, or

(d) comprises several consecutive copies of an amino acid sequence, or

(e) spans a distance of at least 4 nm (40 Angstroms), or

(f) comprises the amino acid sequence GlyGlyGlyGlySerGlyGlyGlyGlySerGlyGlyGlyGlySer, or

(g) comprises one amino acid sequence or a pair of amino acid sequences recognizable by a site-specific
cleavage agent.

12. The polypeptide linker of claim 8, characterized in that at least one of said domains comprises an enzyme, a
toxin, a receptor, a binding site, a binding site of a biosynthetic antibody, a growth factor, a cell-differentiation
factor, a lymphokine, a cytokine, a hormone, a removable detectable moiety, or an anti-metabolite.

13. The polypeptide linker of claim 8, characterized in that said first domain comprises a single-chain binding site
and said second domain comprises an enzyme, a toxin, a receptor, a binding site, a biosynthetic antibody binding
site, a growth factor, a cell-differentiation factor, a lymphokine, a cytokine, a hormone, or an anti-metabolite.

14. The polypeptide linker of claim 8, characterized in that at least one of said domains comprises a polypeptide
capable of sequestering an ion.

15. The polypeptide linker of claim 14, characterized in that the polypeptide comprises calmodulin, metallothionein,
a fragment thereof, or an amino acid sequence rich in at least one of the amino acids glutamic acid, aspartic acid,
lysine, and arginine.

16. The polypeptide linker of claim 8 or 9, characterized in that the amino acids of said linker together assume an
unstructured polypeptide configuration in aqueous solution.

17. DNA encoding the polypeptide linker of claim 8 or 9.

18. Host cell transformed with and capable of expressing the DNA of claim 17.

[Figure: nucleotide sequence, positions 1 to about 370, beginning at an EcoRI site (GAATTC), shown with its three-letter amino acid translation and a restriction-site map beneath each row.]

[Figure: nucleotide sequence, positions 1 to about 355, ending at a BamHI site, shown with its three-letter amino acid translation and a restriction-site map beneath each row.]

[Figure: nucleotide sequence, positions 1 to about 370, running from an EcoRI site to a BamHI site, with alternative nucleotide segments on a second row in places, a three-letter amino acid translation, and a restriction-site map.]

[Figure: nucleotide sequence, positions 1 to about 355, with alternative nucleotide segments on a second row in places, a three-letter amino acid translation, and a restriction-site map.]

[Figure: nucleotide sequence, positions 1 to about 370, ending at a PstI site, shown with its three-letter amino acid translation and a restriction-site map beneath each row.]

[Figure: nucleotide sequence, positions 1 to about 360, ending at a PstI site, shown with its three-letter amino acid translation and a restriction-site map beneath each row.]

[Figure: nucleotide sequence, positions 1 to about 265, shown with its one-letter amino acid translation and a restriction-site map beneath each row.]

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10 20 30 <0 50 *° 70 

CAATTCATCCCTCXCAACAAATTCAACAAGCAACACCXCAXCCCCTICTACCAeATCTTCCACCTCCCCXACCTC 

SrMADNKFHKEQQNAFYEILHlPNt 

EeoRX Mlul BglXX BspMH- 

XnnX 

S3 95 103 115 . 125 125 145 

JUtCGAAGAGCAGCGTAACGGCTTCATCCAAACCTTGAAACACCACCC^^ 
NEEQRHGFIQSLKDDFSQSANLLAE 

HindXXX BspMI* 

EC047III 

160 170 180 190 200 210 220 

CCCAAGAAACrGAACGACGCTCAGGCGCCGAAGAGTGXTCCCGAAGTTCAACTCCXCCAGTCTGGTCCTGAATTC 
AXXLNDAQAPKSDPEVQLQQSGPEL 

Karl **tl 

235 245 255 265 275 285 295 

GTTAAACCTGGCGCCTCTGTGCGCATGTCCTGCAAATCCTCTGGGTACATTTTCACCGACTTCTACATGAATTGG 
V KPGASVRMSCKSSGYXFTDFYMMW 

Narl Fspl 

310 320 330 340 350 360 370 

GTTCGCCAGTCTCATGCTAAGTCTCTACACTACATCGCGTACATTTCCCCATACTCTCGGGTTACCGGCTACAAC 
VRQSHGKSLDYICYXSPYSCVTGYN 
BstXX Xbal M1MI BstEII 

385 395 405 415 425 435 445 

CAGAAGTTTAAAGGTAACGCGACCCTTACTGTCGACAAATCTTCCTCAACTGCTTACATCCAGCTCCGTTCTTTG 
QKFXCKATLTVDKSSSTAYMELRSL 
Oral Sail 

460 470 480 490 500 510 520 

ACCTCTGAGGACTCCGCGGTATACTATTGCGCGGGCTCCTCTGGTAACAAATGGCCCATGGATTATTGGGGTCAT 
TSEDSAVYYCAGSSGNK.WAMDYWGH 
SacII " Hcol 

535 545 555 565 575 585 595 

GGTGCTAGCGTTACTGTGAGCTCTGGTGGCGCTCCGTCGGGCGGTGGTGGCTCGGGTGGCGGCGGATCCGACGTC 
GASVTVSSGGGGSGGGGSGGGGSDV 
Khel Sad BanHI Aatll 

610 620 630 640 650 660 670 

GTTGTTACCCAGACTCCGCTGTCXCTGCCGGTTTCTCTGGGTGACCAGGCTTCTATTTCTTGCCGCTCTTCCCAC 
VVTQTPLSLPVSLGDQASISCRSSQ 

BfitEXX PflM 

685 695 705 715 725 735 745 

TCTCTGGTCCATTCTAATGGTAACACTTACCTGAACTGGTACCTGCAAAAGCCTGGTCAGTCTCCGAAGCTTCTG 

SLVHSHGNTYLMWYLQKAGQSPKLL 
X BatXX BepKI4 KindXIX 

Kpnl 

FIG. 6A-1 
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M g OFDlGOXlN BINDING PROTEIN PCT ml 




835 845 855 865 875 885 895 

CTGAAGATCTCTCCnCTCCACCCCCAACACCrGGCTATCTACTrCTGCTCTCACACTACTCATCTACCCCCCACT 
L K X 1 S RVCAC0 L6 ZYFCS QT THV PPT 
Bglll 

910 920 930 940 

TTTCGTGGTGC CACCAAG CTCG AG ATTAAACGTTAACTGCAG 
PGGGTXLSZXR* 

XboZ Hpal Psti 



FIG. 6A-2 






EP 0 623 679 B1 



10 20 30 40 SO 60 

eATCCTGACeTCGTAATGACCCABACTCCCCTBTCTCTCCCGSTTtCTCTOGrrCACCAG 
DPOVVHTQTPtSLPVSLOOO 

Mil ■•«« 

70 60 90 100 110 120 

5CTTCTATTTC7TBCCGCTCTTCCCAGTCTCTGGTCCATTCTAATGGTAACACTTACCTG 
A B 1 6CRB808LVHS NGNTYL 

PflttX BfttXX 

130 140 190 160 170 1B0 

AACTGGTACCTGCAAAAGGCT6GTCA6TCTCCGAAGCTTCTGATCTACAAAG7CTCTAAC 
NWYLQKAGQ8P1CLL1YKV8N 
BspflX* Hindi XX 

Kpnl 

190 200 210 220 230 240 

CGCTTCTCTGGTGTCCCGGATCGTTTCTCTGG1TCTGGTTCTGGTACTGACTTCACCCTG 
RFSGVPDRFSG8GSGTDFTL 

250 260 270 2B0 2*0 300 

AAGATCtCTCGTGTCGAGGCCGAAGACCTGGGTATCTACTTCTGCTCTCAGACTACTCAT 
K XSRVEAEDLGl YFCSQTTH 
BgiXX 

710 320 330 340 350 360 

6TACCGCCGACTTTT66TGGTGGCACCAAGCTCGAGATTAAACGTGGATCTSSAGGTGGC 
VPPTFGGGTKUE X KRGS.BGG 

XhoX 

370 380 390 400 410 420 

GGATCTGGT66AGGTGGCTCTGGTGGC66TGGATCCGAAGTTCAATTGCAGCA6TCTGGT 
GSGGGGSGGGGSEVQLOGSG 

BanHX 

430 440 450 460 470 480 

CCTGAATTGGTTAAACCTGGCGCCTCTGTBCGCATGTCCTGCAAATCC7CTBGGTACATT 
PEUVKPGASVRHSCICSSGYX 
N*rX FftpX 

490 50O 510 520 530 540 

TTCACCGACTTCTACAT3AATTGG6TTCGCCAGTCTCAT6GTAAGTCTCTABACTACATC 
FTDFYWNWVROSHGKSLDY X 

BfttXX Xb*X 

550 560 570 ' 590 590 600 

GGGTACATTTCCCCATACTCTGGGGTTACCGGCTACAACCAGAAGTTTAAAGGTAAG5C6 
GYX SPYSGVTGYNOKFKGKA 
PfXMX BmtEIX . DraX * 

610 620 630 640 630 660 

ACCCTTACTG7CGACAAATCTTCCTCAACT6CTTACATGGAGCT6CGTTCTTTGACCTCT 
TLTVD KSSSTAYHELRSLTS 
S*1X 

670 680 690 70O 710 720 

GAGGACTCC6CGGTATACTATTGC6CG6GCTCCTCTGGTAACAAATGGGCCATGGATTAT 
EDSAVYYCAGSSGNKWAI1&Y 
Qmzll NcoX 

730 740 750 760 C \ U l*B 

TGGGGTC AT GGTGCTAGC6TT ACT GT B ASCTCTT AACT GC AG • 1 
W G H 5 A 8 V T V S S • 
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
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



10 20 30 40 50 60 

GAACTTCAACTCCACCACTCTCCTCCTCCATTCCTTCCACCTTCCCACACTCTCTCCCTC 
EVQLEQSGPCLVRPS 0 T L S L 

70 80 90 100 110 120 

ACCTCCACATCCTCTCCCTAC^TITTCACCCACTTCTACATCAATTCCCTTCCCCACCCT 

TCTSSGYXFTDFYMNWVRQP 
BapMX* BBtXX 

130 140 150 160 170 180 

CCTGGTCGGGGTCTAGACTACATCGGGTACATTTCCCCATACT 
PGRGLDYIGY XSPYSGVTGY 
Xbal PflMZ BatEXX 

190 200 210 220 230 240 

AACCAGAAGTTTAAAGGTAAGGCGACCCTTCTGGTCAACAAATCTAAGAACCACCCTTCC 
NQKFKGKATL1VNKSKHQAS 
Oral 

250 260 270 280 290 300 

CTGCGGCTCTCTTCTGTCACCGCTGCGGACACCGCGGTATACTATTGCCCGGCCTCCTCT 
tRLSSVTAADTAVYYCAGSS 

SaelZ 

310 320 330 340 350 360 

GGTAACAAATGGGCCATGGATTATTCGGGTCAGGGTrCTCTGGTTACTGTGACCTCTGGT 
CHKWAMDYWGQCSLVTVSSG 

NcoX Sad 

370 380 390 400 410 420 

GGCGSTGGGTCGGGCGGTGGTGGCTCGGG7GGCGGCGGATCCGACGTCGTTATGACCCAG 
GGGSGGGGSGGGGSDVV HTQ 

BaaHX AatXI 

430 440 450 460 470 480 

CCTCCCTCGCTTTCGGGGGCTCCTGGTCAGCGGGTTACTATTTCTTGCCGCTCTTCCCAG 
PPSVSGAPGQRVTXSCRS5Q 

PflM 

490 500 510 520 530 540 

TCTCTCGTCCATTCTAATGGTAACACTTACCTGAACTGGTACCAGCAACTGCCTCGTACG 

5LVHSHGNTYLHW YQQLPGT 
X BstXX KpnX 

550 560 570 580 590 600 

GCTCCGAAGCTrCTGATCTACAAAGTCTCTAACCGCTTCTCTGGTCTCCCGGATCGTTTC 
AFKX#X#XYKVSWRFSGVPDRF 
HindXXX 



610 620 630 640 650 660 

TCTGGTTCTGGTTCTGGTACTGACTTCACCCTCGCGATCACTCGTCTCCACGCCGAAGAC 
SGSG . SGTDFTLAITGLQAED 

670 680 690 700 710 720 

GAGGCTGACTACTTCTGCTCTCAGACTACTCATGTACCCCCGACTTTTGGTCCTGGCACC 
E A DY FCS QTTHVPPTFG GGT 

730 740 750 

AAGCTCACGGTTCTCCGTTAACTGCAG 
KLTVLR + LQ' 
Hpal PstX 



f\Gs. 
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GAATrCCWtCreCAACTGCAGttOTCCTCCT^ 

AJtUlZ P»« M * rX ™ 

EceftX 

70 eo »o xoo no 120 

CCCATCTCCT6CAAATCCTCTCGCTACXCCXTCJICC*ACTA1TACXTCCACTWCTTAW 
RHSCKSSGYTPTHYYXH ■ IK 

pi 

13 ft 140 130 X«0 X70 WO 

CAGTCTCATGGTAAGTCTCTAGAGTGGATCCGTTGGATTTACCWGGTAA^ 
QSH6XSL SWZ6WZ Y P C H C W T 

Xtoal 

190 200 210 3*0 230 240 

MGTACAATGAGAACTITAAAGGTAAGCCGACCCTTACIGTCCACAAATCTrCCTCAACT 
KVNEHPKGKATtT V 0 X S S S T 
Oral *»** 

J50 260 270 2»0 290 200 

GCTTACATGGAGCTCCCTTCTrTOACCTCTCACGACTCCCCCGTATACTATTCKMCGT 

3 10 320 "0 540 350 3*0 

TACACTCATTATTACTTCGATTACTGGGCCCATGGCGCTAGCCCTAJ^ 
VTHYYFDYWGHGASVTVSSG 

Y T H Y ncoI Nh«X S*cX 

370 380 390 400 410 420 

6GCCCTGGCTCGGCCGGTCGTGGGTCGGGTGGCCGCGGATCCGACCTCCWATCACCCAG 
CC6SCGGCSGGGGSDVVHTQ 

430 440 450 460 470 

ACrCCGCTCTCTCTCCCGGTTTCTCTCGGTGACC^GGCTTCTATrrcrrC 
TPX.SLPVSLC0QASZSC)RS5 

BstEXX 

490 500 510 520 530 540 

CAGTCTATCGTCCATTCTAATGGTAACACTXACCTGGAGTGGTACCTGCAAAAGGCTGGT 
QS XVKSMGNTYLEWYt QKAC 

™ Kpnl 

550 560 570 510 590 600 

CAGTCTCCGAAGCTrCTGATCTACAAAGTCTCTAACCGCTTCTCTCGTGTCCCCGATCCT 

QSPKLLXYKVSMRFSGVPDR 
KindZXX 

610 620 630 640 650 660 

TYCTCTG G 'IV'CX' GG T T CT G GTACTGACTTCACCCTGAACATCTCTCGTCTCCACCCCCAG 
PSCSGSCTOFTt S R V E A E 

6 70 680 690 700 710 720 

GATCTGCGTATCTACTACTCCTTCCAAGGGTCTCATGTACCGTGCACXTTCCGCGGTG6G 
0 L G X Y Y CFQCSHVPWTFC66 

730 740 750 ft 

ACCAAGCTCGAGATTAAACCTTAACTGCAG C I Cm ID 

TXLEIXR*tQ ^ lV3U 

Xhol Hpal PstX 
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10 20 30 40 50 60 

GATCCCGAGGTrATGCTGGTrGAATCTGGTGCAGTACTGATCCAACCTGGTGCGTCCCTG 
DPEVMLVESCCVLMEPCGSL 

Seal EcoO 

70 80 90 100 X10 120 

AAGCTCACCTCTCCTCCTACCCCCTTCACCTTCTCTCCTrACCCCATCTCTTCCCTCCCT 
KLSCAASGFTFSRY AMSWVR 

Eapl Nhal PflMI 

130 140 ISO 160 170 180 

CACACTCCGGACAAGCGTCTACAGTGGGTCGCGACGATATCTTCTGGTCCTTCTCACACG 
QTPEKRLEWVATISSGCSHT 
N BapMXZ XbaX Nrul EcoRV 

190 200 210 220 230 240 

TTCCATCCAGACAGTGTGAAGCGTCGATTCACGATCTCTCGAGACAACGCXAAGAACACG 
FHPDSVK CRFTISRDNAKNT 

Xhol 

250 260 270 280 290 300 

TTGTACCTGCAAATGTCTTCTCTACCTAGTGAAGATACTGCTATGTACTACTCTGCACGT 
LYLQMSS LRSEDTAMYYCAR 
BspMI* SnaBX ApaU 

310 320 330 340 350 360 

CCTCCACTGATCTCACTAGTTGCTGATTATCCCATGCATTATTGGGGTCATGGTGCTAGC 
PPLISLVADYAMDYWGHGAS 
Spez Keol Nhel 

370 380 390 400 410 420 

GTTACTGTCAGCTCTGCTGGCGGTGCGTCGGGCCCTGGTGGCTCGGGTGGCCGCGGATCG 
VTVSSGCGCSGGGGSGGGGS 

Sad 

430 440 450 460 470 480 

GATATCGTTATGACTCAGTCTCATAAGTTCATGTCCACTTCTCTTCGTGACCGTGTrrCX 

OXVMTQSHKFMSTSVCDRVS 
EcoRV BstEII 

490 500 510 520 530 540 

A7CACTTG7AAGGCCAGCCAGGA7GTGGGTGCTGCTATCGCATGGTATCAGCAGAAGCCC 
ITCKASQDVGAAIAWYQQKP 
PflHI Saa 

550 560 570 580 590 600 

GGGCACTCTCCTAACCTCCTCATCTACTGGGCGTCGACTCCTCATACTGGTCTCCCCCAT 

GQSPKLL2YWASTRHTCVPD 
2 Sail 

610 620 630 640 650 660 

CGTTTCACTGGGTCCGGATCAGGTACTGATTTCACTCTGACTATTTCGAACCTTCAGTCT 
RFT6SGSGT0FTLTZ5NVQS 
BspMlI AcuIJ 

670 660 690 700 710 720 

GATGACCTGGCTGATTACTTCTCCCAGCAATATTCCGGCTACCCTCTGACTTTCGCTGCC 
DDLADYFCQQYSGYPLTFC.A 

SspX Kpnl Nae 

730 740 750 „ nr Qk 

CGCACTAAACTCGAGCTGAACTAACTGCAG r i VJl • * 12 

6TKLELK* 
X Xhol Pstl 
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10 20 « 40 50 60 

CATCCCCACOTTATCCTCGTrCAATCTCOTCCACTACTOATGCAACCTCCTCGGTCCCTC 
D P E V M L V E S G L M E P 6 L 

70 80 »0 100 110 120 

AAGCTGAGCTGTGCTGCTACCGGCTTCXCGTTCTCTCCTTACCCCATGTCTTGCGTCCCT 

KLSCAASG FTFSRY A M S W v Jt 
E*pl WwX * fUa 

130 140 150 160 1^0 180 

CACACTCCGCAGAAGCGTCTAGAGTGGGTCGCGAC^TATOT^ 

° T B.?KI1 K "Xbil* " AlUlI 

ion 200 210 220 230 240 

tactatcSgacagtgtgaaggctcgattcacgatctctcgagacaacgctaa^ 
yyposvkgrftx s^r d h a x h t 

250 260 270 260 290 300 

ttgtacctgcaaatgtcttctctacgtagtgaagatacigctatgtactactgtg^cgt 

LYLQMSSLRSEDTAMYY C A R 
BspMI+ SnaBI A P 8 « 

310 320 330 340 350 360 

CCTCCACTGATCTCACTACTTGCTGATTATGCC^TGGATTATTGCGGTCATGGTGCTAG^ 
PPL1SLVADYAM0YWCHG A S 
Spel Mcol »»»X 

370 380 390 400 410 420 

GTXACTGTGAGCTCTGGTGGCGGTGGGTCGGGCGGTCCTGGCTCGGGTCGCCGCGGATCG 
VTVSSGGGGSGGGGSGGGG5 

sael 

430 440 450 460 470 480 

GATATCGTTAT6ACTCAGTCTCATAAGTTCAT6TCCACTTCT6W6GTCACCGTGTTTCT 

01VMTQSH KFMSTSV G O R V S 
ECORV BSttM 

490 500 510 520 530 540 

ATCACTTGTAAGGCCAGCCAGGATGTGGGTGCTGCTATCCCATMTATCAGCAGAAGCCC 

ITCXASQDVCAAIAHYQQK F 
PIlMI Stta 

550 560 570 580 590 600 

CGGCAGTCTCCTAAGCTGCTGATCTACTGGGCGTCGACTCGTCATACTGGTCTCCCGGAT 

GQSPKltX»2 YWASTRHTGVPD 
2 Sail 

610 620 630 640 650 660 

CGTTTCACTGGCTCCGGATCAGCTACTGATrTCACTCTGACTATTrCGAACGTTCAGTCT 
RFT3S GSGTDFTX.T1SNVQS 

BspMIZ A8«W 

S70 680 690 700 710 720 

CATGACCTGGCTCATTACrrCtGCCAGCAATATTCCGGGTACCCICTGACTTTCGGTGCC 
DDLADYFCQQYSGYPLTFG A 

SspZ Xpnl »» e 

730 740 . 750 ri RE 

GGCACTAAACTCGAGCTGAAGTAACTCCAG * 

GTKLELK* 
1 Xhol PBtl 
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10 20 

Met Lys Ala Ile Phe Val Leu Lys Gly Ser Leu Asp Arg Asp Leu Asp Ser Arg Leu Asp 

ATG AAA GCA ATT TTC GTA CTC AAA GGT TCA CTC GAC AGA GAT CTG GAC TCT CGT CTC GAT 

BglII 

30 40 
Leu Asp Val Arg Thr Asp His Lys Asp Leu Ser Asp His Leu Val Leu Val Asp Leu Ala 
CTC GAC GTT CGT ACC GAC CAC AAA GAC CTC TCT GAT CAC CTC GTT CTC GTC GAC CTC GCT 

BclI SalI 
50 60 
Arg Asn Asp Leu Ala Arg Ile Val Thr Pro Gly Ser Arg Tyr Val Ala Asp Leu Glu Phe 
CGT AAC GAC CTC GCT CGT ATC GTT ACT CCC GGG TCT CGT TAC GTT GCS GAT CTC GAA TTC 

SmaI EcoRI 

asp f»Cn. JO A 

GAT 



EcoRI 

Sspl 




AflH 
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[Figure: image with size-marker labels 94, 29, 20.1 and 14.4] 



S5I[5J5iIi?55;SEHQFFLKLDSVTTAT 
YYCACOHDHLYFOyWCOCTTLTVS 

CCGCSGGGGSGGGGS 

QAVVTQESALTTSPCGTVILT 
TSNYANWIQEKPDHLFTGLIG 
PVRFSCSLIGOKAALTITGAO 
ALWFRNHFVFGGGTKVTVLG 



CRSSTGAVT 
CTSNRAPGV 
TEDDAMYFC 



FIG. 9C 
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[Plot axis label: INHIBITOR CONCENTRATION (M)] 
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LOG UNBOUND DIGOXIN CONCENTRATION (M) 
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10 20 30 40 50 60 

GAATTCATGGCTGACAACAAATTCAACAAGGAACAGCAGAACGCGTTCTACGAGATCTTG 

EFMADNKFNKEQQNAFYEIL 
EcoRI MluI BglII 

XmnI 

70 80 90 100 110 120 

CACCTGCCGAACCTGAACGAAGAGCAGCGTAACGGCTTCATCCAAAGCTTGAAGGATGAG 
HLPNLNEEQRNGFIQSLKDE 
BspMI+ HindIII 

130 140 150 160 170 180 

CCCTCTCAGTCTGCGAATCTGCTAGCGGATGCGAAGAAACTGAACGATGCGCAGGCACCG 
PSQSANLLADAKKLNDAQAP 

NheI FspI 

190 200 210 220 230 240 

AAATCGGATCAGGGGCAATTCATGGCTGACAACAAATTCAACAAGGAACAGCAGAACGCG 
KSDQGQFMADNKFNKEQQNA 

MluI 
XmnI 

250 260 270 280 290 300 

TTCTACGAGATCTTGCACCTGCCGAACCTGAACGAAGAGCAGCGTAACGGCTTCATCCAA 
FYEILHLPNLNEEQRNGFIQ 
BglII BspMI* H 

310 320 330 340 350 360 

AGCTTGAAGGATGAGCCCTCTCAGTCTGCGAATCTGCTAGCGGATGCCAAGAAACTGAAC 

SLKDEPSQSANLLADAKKLN 
indIII NheI 

370 380 Cl £- id 

GATGCGCAGGCACCGAAATCGGATCC r i «l . ^ 

DAQAPKSDP 
FspI BamHI 
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( BABS ) - 7d 

bSbHI C H S B«nX* E»PI 

10 5 US 1« 135 —*15. 



Sphl 

AAATCGTCGCAGCTCCCTTAACTCCAG 

K W W E L R * 

Hpal PStI 
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<BABS>- 

10 20 30 40 30 60 

CSATCCOCTCSCGACCCGTCCAAGGACTCaaAfiCTCACCTTTCTCCTCCCCAACCTCCT 
GSGGDPSKDSKAQVSAAEAG 

BamHI 

70 80 90 100 110 120 

ATCACTGGCACCTGGTATAACCAACTGGGGTCGACTTTCATTGTGACCGCTGGTGCGGAC 
ITGTWYNQLGSTFIVTAGAD 

SalI 

130 140 150 160 170 180 

GGAGCTCTGACTGGCACCTACGAATCTGCCGTTGGTAACGCAGAATCCCGCTACGTACTG 
GALTGTYESAVGNAESRYVL 

SacI SnaBI 

190 200 210 220 230 240 

ACTGGCCGTTATGACTCTGCACCTGCCACCGATGGCTCTGGTACCGCTCTGGGCTGGACT 
TGRYDSAPATDGSGTALGWT 

BspMI+ KpnI 

250 260 270 280 290 300 

GTGGCTTGGAAAAACAACTATCGTAATGCGCACAGCGCCACTACGTGGTCTGGCCAATAC 
VAWKNNYRNAHSATTWSGQY 

FspI DraIII BalI 

PflMI BstXI 

310 320 330 340 350 360 

GTTGGCGGTGCTGAGGCTCGTATCAACACTCAGTGGCTGTTAACATCCGGCACTACCGAA 
VGGAEARINTQWLLTSGTTE 

DraIII HpaI 

370 380 390 400 410 420 

GCGAATGCATGGAAATCGACACTAGTAGGTCATGACACCTTTACCAAAGTTAAGCCTTCT 
ANAWKSTLVGHDTFTKVKPS 
BsmI+ SpeI 
NsiI 

430 440 450 460 470 480 

GCTGCTAGCATTGATGCTGCCAAGAAAGCAGGCGTAAACAACGGTAACCCTCTAGACGCT 
AASIDAAKKAGVNNGNPLDA 
NheI BstEII XbaI 

490 500 

GTTCAGCAATAACTGCAG c . r iC R 

v q q * rival. 

PstI 
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(BABS) - 

30 40 50 60 

BamHI SnaBI 

an 90 100 U0 120 

AHPQAEG QLQWi« x 
Mstll 

A N sad P«1MI K P nI 

200 210 220 230 240 



seal 

250 



260 270 280 290 300 



HTISRIAVSYQT HpaIBspMI+ EC047III 

Espl 

<*on 390 40° 420 

BstEII 

cgtcctgaISatctag^^ 
R P ° *XbaI ° Ball 

4 !° CiCn »5C 

* 

PstI 






(BABS) - 

10 20 30 40 50 60 

GGATCCGGTGCTGATCAGCTGACTGACGAGCAGATCGCTGAATTTAAAGAGGCTTTCTCT 

GSGADQLTDEQIAEFKEAFS 
BamHI BclI PvuII DraI 

70 80 90 100 110 120 

CTGTTTGACAAAGACGGTGACGGTACCATCACTACCAAAGAGCTCGGCACCGTTATGCGC 

LFDKDGDGTITTKELGTVMR 

KpnI SacI FspI 

130 140 150 160 170 180 

AGCCTTGGCCAGAACCCGACTGAAGCTGAATTGCAGGACATGATCAACGAAGTCGACGCT 

SLGQNPTEAELQDMINEVDA 

BalI BclI SalI 

190 200 210 220 230 240 

GACGGTAACGGCACCATCGATTTTCCGGAATTTCTGAACCTGATGGCGCGCAAGATGAAA 
DGNGTIDFPEFLNLMARKMK 
ClaI BspMII BssHII 

250 260 270 280 290 300 

GACACTGACTCTGAAGAGGAACTGAAAGAGGCCTTCCGTGTTTTCGACAAAGACGGTAAC 
DTDSEEELKEAFRVFDKDGN 

StuI 

310 320 330 340 350 360 

GGTTTCATCTCGGCCGCTGAACTGCGTCACGTTATGACTAACCTGGGTGAAAAGCTTACT 
GFISAAELRHVMTNLGEKLT 
EagI HindIII 

370 380 390 400 410 420 

GACGAAGAAGTTGACGAAATGATTCGCGAAGCTGACGTCGATGGTGACGGCCAGGTTAAC 
DEEVDEMIREADVDGDGQVN 
XmnI NruI AatII HpaI 

430 440 450 

TACGAAGAGTTCGTTCAGGTTATGATGGCTAAGTAACTGCAG 

YEEFVQVMMAK* 

PstI 

FIG. 15D 
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( BABS ) • 4Q 50 60 

GGATCCGGTGGAGGCTCTCTCGGCTCTCTGACT j A E P A M I A E 
GSGGGSLGSi.* 



BaaHI , 2 0 

90 100 HO 



TGCAA6ACTCGTACCGAA6TCTTC<SAGATCTCTCCT 
C K T R T E V F BIS ^ 



170 180 

130 

H F L T*«" P * C V E 



520 230 240 

» 0» !iS..r.rT=MTCCOSICOM<!TCCaClA(ATe 



R " V ' ™" 

2fl0 290 

250 fSr^CTTT^AAGGCCACTGTTACTCTGGAAGACCATCTG 
GAGATTGTACGTAAGAAACCGATCTTTAAGAAG vTL jDHL 

SnaBI 

130 340 350 

AC « C E TV A^A pstl 

FIG. 15E 
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(BABS) - 

10 20 30 40 50 60 

GGATCCGGTATATTCCCCAAACAATACCCAATTATAAACTTTACCACAGCGGGTGCCACT 

GSGIFPKQYPIINFTTAGAT 
BamHI 

70 80 90 100 110 120 

GTGCAAAGCTACACAAACTTTATCAGAGCTGTTCGCGGTCGTTTAACAACTGGAGCTGAT 
VQSYTNFIRAVRGRLTTGAD 

130 140 150 160 170 180 

GTGAGACATGAAATACCAGTGTTGCCAAACAGAGTTGGTTTGCCTATAAACCAACGGTTT 
VRHEIPVLPNRVGLPINQRF 

190 200 210 220 230 240 

ATTTTAGTTGAACTCTCAAATCATGCAGAGCTTTCTGTTACATTAGCGCTGGATGTCACC 
ILVELSNHAELSVTLALDVT 

Eco47III 

250 260 270 280 290 300 

AATGCATATGTGGTCGGCTACCGTGCTGGAAATAGCGCATATTTCTTTCATCCTGACAAT 
NAYVVGYRAGNSAYFFHPDN 

NdeI 
NsiI 

310 320 330 340 350 360 

CAGGAAGATGCAGAAGCAATCACTCATCTTTTCACTGATGTTCAAAATCGATATACATTC 
QEDAEAITHLFTDVQNRYTF 

ClaI 

370 380 390 400 410 420 

GCCTTTGGTGGTAATTATGATAGACTTGAACAACTTGCTGGTAATCTGAGAGAAAATATC 
AFGGNYDRLEQLAGNLRENI 

430 440 450 460 470 480 

GAGTTGGGAAATGGTCCACTAGAGGAGGCTATCTCAGCGCTTTATTATTACAGTACTGGT 
ELGNGPLEEAISALYYYSTG 

Eco47III ScaI 

490 500 510 520 530 540 

GGCACTCAGCTTCCAACTCTGGCTCGTTCCTTTATAATTTGCATCCAAATGATTTCAGAA 
GTQLPTLARSFIICIQMISE 

550 560 570 580 590 600 

GCAGCAAGATTCCAATATATTGAGGGAGAAATGCGCACGAGAATTAGGTACAACCGGAGA 
AARFQYIEGEMRTRIRYNRR 

FspI Bgl 
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(BABS) - 50 60 

2 ^5 l^^^^l^t^* 
BasHX 

R H L T F K F « 
CLEEE1' KPLE * ScaI 

290 300 



F H It R ^ Bell ""^ 

OSE TTFMCEXA 

190 400 410 420 

f\(n. >5&i 
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(BABS)- 

10 20 30 40 50 60 

GGATCCGGTGCTGACAACAAATTCAACAAGGAACAGCAGAACGCGTTCTACGAGATCTTG 

GSGADNKFNKEQQNAFYEIL 
BamHI MluI BglII 

XmnI 

70 80 90 100 110 120 

CACCTGCCGAACCTGAACGAAGAGCAGCGTAACGGCTTCATCCAAAGCTTGAAGGATGAG 
HLPNLNEEQRNGFIQSLKDE 
BspMI+ HindIII 

130 140 150 160 170 180 

CCCTCTCAGTCTGCGAATCTGCTAGCGGATGCCAAGAAACTGAACGATGCGCAGGCACCG 
PSQSANLLADAKKLNDAQAP 

NheI FspI 

190 200 210 220 230 240 

AAATCGGATCAGGGGCAATTCATGGCTGACAACAAATTCAACAAGGAACAGCAGAACGCG 
KSDQGQFMADNKFNKEQQNA 

MluI 
XmnI 

250 260 270 280 290 300 

TTCTACGAGATCTTGCACCTGCCGAACCTGAACGAAGAGCAGCGTAACGGCTTCATCCAA 
FYEILHLPNLNEEQRNGFIQ 
BglII BspMI* H 

310 320 330 340 350 360 

AGCTTGAAGGATGAGCCCTCTCAGTCTGCGAATCTGCTAGCGGATGCCAAGAAACTGAAC 

SLKDEPSQSANLLADAKKLN 
indIII NheI 

370 380 

GATGCGCAGGCACCGAAATAACTGCAG 
DAQAPK* 
FspI PstI 

FIG. 15H 
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